GitHub Repository: freebsd/freebsd-src
Path: blob/main/contrib/llvm-project/llvm/lib/Target/WebAssembly/WebAssemblyLowerEmscriptenEHSjLj.cpp
//=== WebAssemblyLowerEmscriptenEHSjLj.cpp - Lower exceptions for Emscripten =//
//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//
//===----------------------------------------------------------------------===//
///
/// \file
/// This file lowers exception-related instructions and setjmp/longjmp function
/// calls to use Emscripten's library functions. The pass uses JavaScript's try
/// and catch mechanism in case of Emscripten EH/SjLj and Wasm EH intrinsics in
/// case of Wasm SjLj.
///
/// * Emscripten exception handling
/// This pass lowers invokes and landingpads into library functions in JS glue
/// code. Invokes are lowered into function wrappers called invoke wrappers that
/// exist on the JS side and wrap the original function call in JS try-catch.
/// If an exception occurs, the cxa_throw() function on the JS side sets some
/// variables (see below) so we can check whether an exception occurred from
/// wasm code and handle it appropriately.
///
/// * Emscripten setjmp-longjmp handling
/// This pass lowers setjmp to a reasonably-performant approach for emscripten.
/// The idea is that each block with a setjmp is broken up into two parts: the
/// part containing setjmp and the part right after the setjmp. The latter part
/// is either reached from the setjmp, or later from a longjmp. To handle the
/// longjmp, all calls that might longjmp are also called using invoke wrappers
/// and thus JS try-catch. The JS longjmp() function also sets some variables
/// so we can check whether a longjmp occurred from wasm code. Each block with a
/// function call that might longjmp is also split up after the longjmp call.
/// After the longjmp call, we check whether a longjmp occurred, and if it did,
/// which setjmp it corresponds to, and jump to the right post-setjmp block.
/// We assume setjmp-longjmp handling always runs after EH handling, which means
/// we don't expect any exception-related instructions when SjLj runs.
/// FIXME Currently this scheme does not support indirect calls to setjmp,
/// because of the limitation of the scheme itself. fastcomp does not support it
/// either.
///
/// In detail, this pass does the following things:
///
/// 1) Assumes the existence of global variables: __THREW__, __threwValue
///    __THREW__ and __threwValue are defined in compiler-rt in Emscripten.
///    These variables are used for both exceptions and setjmp/longjmps.
///    __THREW__ indicates whether an exception or a longjmp occurred or not. 0
///    means nothing occurred, 1 means an exception occurred, and other numbers
///    mean a longjmp occurred. In the case of longjmp, the __THREW__ variable
///    indicates the setjmp buffer the longjmp corresponds to.
///    __threwValue is 0 for exceptions, and the argument to longjmp in case of
///    longjmp.
///
/// * Emscripten exception handling
///
/// 2) We assume the existence of setThrew and setTempRet0/getTempRet0 functions
///    at link time. setThrew exists in Emscripten's compiler-rt:
///
///    void setThrew(uintptr_t threw, int value) {
///      if (__THREW__ == 0) {
///        __THREW__ = threw;
///        __threwValue = value;
///      }
///    }
///
///    setTempRet0 is called from __cxa_find_matching_catch() in JS glue code.
///    In exception handling, getTempRet0 indicates the type of an exception
///    caught, and in setjmp/longjmp, it means the second argument to the
///    longjmp function.
///
/// 3) Lower
///      invoke @func(arg1, arg2) to label %invoke.cont unwind label %lpad
///    into
///      __THREW__ = 0;
///      call @__invoke_SIG(func, arg1, arg2)
///      %__THREW__.val = __THREW__;
///      __THREW__ = 0;
///      if (%__THREW__.val == 1)
///        goto %lpad
///      else
///        goto %invoke.cont
///    SIG is a mangled string generated based on the LLVM IR-level function
///    signature. After LLVM IR types are lowered to the target wasm types,
///    the names for these wrappers will change based on wasm types as well,
///    as in invoke_vi (function takes an int and returns void). The bodies of
///    these wrappers will be generated in JS glue code, and inside those
///    wrappers we use JS try-catch to generate actual exception effects. Each
///    wrapper also calls the original callee function. An example wrapper in
///    JS code would look like this:
///      function invoke_vi(index,a1) {
///        try {
///          Module["dynCall_vi"](index,a1); // This calls original callee
///        } catch(e) {
///          if (typeof e !== 'number' && e !== 'longjmp') throw e;
///          _setThrew(1, 0); // setThrew is called here
///        }
///      }
///    If an exception is thrown, __THREW__ will be set to 1 in a wrapper,
///    so we can jump to the right BB based on this value.
///
/// 4) Lower
///      %val = landingpad catch c1 catch c2 catch c3 ...
///      ... use %val ...
///    into
///      %fmc = call @__cxa_find_matching_catch_N(c1, c2, c3, ...)
///      %val = {%fmc, getTempRet0()}
///      ... use %val ...
///    Here N is a number calculated based on the number of clauses.
///    setTempRet0 is called from __cxa_find_matching_catch() in JS glue code.
///
/// 5) Lower
///      resume {%a, %b}
///    into
///      call @__resumeException(%a)
///    where __resumeException() is a function in JS glue code.
///
/// 6) Lower
///      call @llvm.eh.typeid.for(type) (intrinsic)
///    into
///      call @llvm_eh_typeid_for(type)
///    llvm_eh_typeid_for function will be generated in JS glue code.
///
/// * Emscripten setjmp / longjmp handling
///
/// If there are calls to longjmp()
///
/// 1) Lower
///      longjmp(env, val)
///    into
///      emscripten_longjmp(env, val)
///
/// If there are calls to setjmp()
///
/// 2) In the function entry that calls setjmp, initialize
///    functionInvocationId as follows:
///
///    functionInvocationId = alloca(4)
///
///    Note: the alloca size is not important as this pointer is
///    merely used for pointer comparisons.
///
/// 3) Lower
///      setjmp(env)
///    into
///      __wasm_setjmp(env, label, functionInvocationId)
///
///    __wasm_setjmp records the necessary info (the label and
///    functionInvocationId) in the "env".
///    A BB with setjmp is split into two after the setjmp call in order to
///    make the post-setjmp BB the possible destination of longjmp BB.
///
/// 4) Lower every call that might longjmp into
///      __THREW__ = 0;
///      call @__invoke_SIG(func, arg1, arg2)
///      %__THREW__.val = __THREW__;
///      __THREW__ = 0;
///      %__threwValue.val = __threwValue;
///      if (%__THREW__.val != 0 & %__threwValue.val != 0) {
///        %label = __wasm_setjmp_test(%__THREW__.val, functionInvocationId);
///        if (%label == 0)
///          emscripten_longjmp(%__THREW__.val, %__threwValue.val);
///        setTempRet0(%__threwValue.val);
///      } else {
///        %label = -1;
///      }
///      longjmp_result = getTempRet0();
///      switch %label {
///        label 1: goto post-setjmp BB 1
///        label 2: goto post-setjmp BB 2
///        ...
///        default: goto split next BB
///      }
///
///    __wasm_setjmp_test examines the jmp buf to see if it was for a matching
///    setjmp call. After calling an invoke wrapper, if a longjmp occurred,
///    __THREW__ will be the address of the matching jmp_buf buffer and
///    __threwValue will be the second argument to longjmp.
///    __wasm_setjmp_test returns a setjmp label, a unique ID to each setjmp
///    callsite. Label 0 means this longjmp buffer does not correspond to one
///    of the setjmp callsites in this function, so in this case we just chain
///    the longjmp to the caller. Label -1 means no longjmp occurred.
///    Otherwise we jump to the right post-setjmp BB based on the label.
///
/// * Wasm setjmp / longjmp handling
/// This mode still uses some Emscripten library functions but not JavaScript's
/// try-catch mechanism. It instead uses Wasm exception handling intrinsics,
/// which will be lowered to exception handling instructions.
///
/// If there are calls to longjmp()
///
/// 1) Lower
///      longjmp(env, val)
///    into
///      __wasm_longjmp(env, val)
///
/// If there are calls to setjmp()
///
/// 2) and 3): The same as 2) and 3) in Emscripten SjLj.
/// (functionInvocationId initialization + setjmp callsite transformation)
///
/// 4) Create a catchpad with a wasm.catch() intrinsic, which returns the value
/// thrown by the __wasm_longjmp function. In the runtime library, we have an
/// equivalent of the following struct:
///
/// struct __WasmLongjmpArgs {
///   void *env;
///   int val;
/// };
///
/// The thrown value here is a pointer to the struct. We use this struct to
/// transfer two values by throwing a single value. Wasm throw and catch
/// instructions are capable of throwing and catching multiple values, but
/// that also requires multivalue support that is currently not very reliable.
/// TODO Switch to throwing and catching two values without using the struct
///
/// All longjmpable function calls will be converted to an invoke that will
/// unwind to this catchpad in case a longjmp occurs. Within the catchpad, we
/// test the thrown values using the __wasm_setjmp_test function as we do for
/// Emscripten SjLj. The main difference is, in Emscripten SjLj, we need to
/// transform every longjmpable callsite into a sequence of code including a
/// __wasm_setjmp_test() call; in Wasm SjLj we do the testing in only one
/// place, in this catchpad.
///
/// After calling __wasm_setjmp_test(), if the longjmp does not
/// correspond to one of the setjmps within the current function, it rethrows
/// the longjmp by calling __wasm_longjmp(). If it corresponds to one of the
/// setjmps in the function, we jump to the beginning of the function, which
/// contains a switch to each post-setjmp BB. Again, in Emscripten SjLj, this
/// switch is added for every longjmpable callsite; in Wasm SjLj we do this
/// only once at the top of the function (after functionInvocationId
/// initialization).
///
/// Below is the pseudocode for what we have described:
///
/// entry:
///   Initialize functionInvocationId
///
/// setjmp.dispatch:
///    switch %label {
///      label 1: goto post-setjmp BB 1
///      label 2: goto post-setjmp BB 2
///      ...
///      default: goto split next BB
///    }
/// ...
///
/// bb:
///   invoke void @foo() ;; foo is a longjmpable function
///     to label %next unwind label %catch.dispatch.longjmp
/// ...
///
/// catch.dispatch.longjmp:
///   %0 = catchswitch within none [label %catch.longjmp] unwind to caller
///
/// catch.longjmp:
///   %longjmp.args = wasm.catch() ;; struct __WasmLongjmpArgs
///   %env = load 'env' field from __WasmLongjmpArgs
///   %val = load 'val' field from __WasmLongjmpArgs
///   %label = __wasm_setjmp_test(%env, functionInvocationId);
///   if (%label == 0)
///     __wasm_longjmp(%env, %val)
///   catchret to %setjmp.dispatch
///
///===----------------------------------------------------------------------===//
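
// Illustrative sketch, not part of the pass: the file comment above describes
// how a lowered invoke communicates through __THREW__ and setThrew. The
// standalone C++ below imitates that protocol so the control flow is easy to
// follow. Everything here is an assumption made for illustration: THREW and
// threwValue stand in for __THREW__ and __threwValue (renamed because
// double-underscore identifiers are reserved), invokeVI is a hypothetical
// C++ analog of the JS wrapper invoke_vi, and mayThrow is a made-up callee.

```cpp
#include <cassert>
#include <cstdint>
#include <stdexcept>

// Stand-ins for the globals that Emscripten's compiler-rt defines.
static uintptr_t THREW = 0;
static int threwValue = 0;

// Same shape as setThrew in the file comment: only the first event sticks.
static void setThrew(uintptr_t threw, int value) {
  if (THREW == 0) {
    THREW = threw;
    threwValue = value;
  }
}

// Hypothetical analog of a JS invoke wrapper: run the callee inside
// try-catch and record that an exception happened instead of propagating it.
static void invokeVI(void (*callee)(int), int a1) {
  try {
    callee(a1);
  } catch (...) {
    setThrew(1, 0);
  }
}

// Shape of the code emitted for one lowered invoke. Returns true when control
// would branch to the landing pad, false for the normal continuation.
static bool callAndCheck(void (*callee)(int), int a1) {
  assert(callee && "callee must not be null");
  THREW = 0; // pre-invoke: __THREW__ = 0
  invokeVI(callee, a1);
  uintptr_t threwVal = THREW; // post-invoke: %__THREW__.val = __THREW__
  THREW = 0;                  // post-invoke: __THREW__ = 0
  return threwVal == 1;
}

// Made-up callee that throws for negative input.
static void mayThrow(int x) {
  if (x < 0)
    throw std::runtime_error("negative input");
}
```

// In this sketch callAndCheck(mayThrow, -1) takes the landing-pad path and
// callAndCheck(mayThrow, 1) takes the normal path; THREW is reset either way.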

#include "MCTargetDesc/WebAssemblyMCTargetDesc.h"
#include "WebAssembly.h"
#include "WebAssemblyTargetMachine.h"
#include "llvm/ADT/StringExtras.h"
#include "llvm/CodeGen/TargetPassConfig.h"
#include "llvm/CodeGen/WasmEHFuncInfo.h"
#include "llvm/IR/DebugInfoMetadata.h"
#include "llvm/IR/Dominators.h"
#include "llvm/IR/IRBuilder.h"
#include "llvm/IR/IntrinsicsWebAssembly.h"
#include "llvm/IR/Module.h"
#include "llvm/Support/CommandLine.h"
#include "llvm/Transforms/Utils/BasicBlockUtils.h"
#include "llvm/Transforms/Utils/Local.h"
#include "llvm/Transforms/Utils/SSAUpdater.h"
#include "llvm/Transforms/Utils/SSAUpdaterBulk.h"
#include <set>

using namespace llvm;

#define DEBUG_TYPE "wasm-lower-em-ehsjlj"

static cl::list<std::string>
    EHAllowlist("emscripten-cxx-exceptions-allowed",
                cl::desc("The list of function names in which Emscripten-style "
                         "exception handling is enabled (see emscripten "
                         "EMSCRIPTEN_CATCHING_ALLOWED options)"),
                cl::CommaSeparated);

namespace {
class WebAssemblyLowerEmscriptenEHSjLj final : public ModulePass {
  bool EnableEmEH;     // Enable Emscripten exception handling
  bool EnableEmSjLj;   // Enable Emscripten setjmp/longjmp handling
  bool EnableWasmSjLj; // Enable Wasm setjmp/longjmp handling
  bool DoSjLj;         // Whether we actually perform setjmp/longjmp handling

  GlobalVariable *ThrewGV = nullptr;      // __THREW__ (Emscripten)
  GlobalVariable *ThrewValueGV = nullptr; // __threwValue (Emscripten)
  Function *GetTempRet0F = nullptr;       // getTempRet0() (Emscripten)
  Function *SetTempRet0F = nullptr;       // setTempRet0() (Emscripten)
  Function *ResumeF = nullptr;            // __resumeException() (Emscripten)
  Function *EHTypeIDF = nullptr;          // llvm.eh.typeid.for() (intrinsic)
  Function *EmLongjmpF = nullptr;         // emscripten_longjmp() (Emscripten)
  Function *WasmSetjmpF = nullptr;        // __wasm_setjmp() (Emscripten)
  Function *WasmSetjmpTestF = nullptr;    // __wasm_setjmp_test() (Emscripten)
  Function *WasmLongjmpF = nullptr;       // __wasm_longjmp() (Emscripten)
  Function *CatchF = nullptr;             // wasm.catch() (intrinsic)

  // type of 'struct __WasmLongjmpArgs' defined in emscripten
  Type *LongjmpArgsTy = nullptr;

  // __cxa_find_matching_catch_N functions.
  // Indexed by the number of clauses in an original landingpad instruction.
  DenseMap<int, Function *> FindMatchingCatches;
  // Map of <function signature string, invoke_ wrappers>
  StringMap<Function *> InvokeWrappers;
  // Set of allowed function names for exception handling
  std::set<std::string> EHAllowlistSet;
  // Functions that contain calls to setjmp
  SmallPtrSet<Function *, 8> SetjmpUsers;

  StringRef getPassName() const override {
    return "WebAssembly Lower Emscripten Exceptions";
  }

  using InstVector = SmallVectorImpl<Instruction *>;
  bool runEHOnFunction(Function &F);
  bool runSjLjOnFunction(Function &F);
  void handleLongjmpableCallsForEmscriptenSjLj(
      Function &F, Instruction *FunctionInvocationId,
      SmallVectorImpl<PHINode *> &SetjmpRetPHIs);
  void
  handleLongjmpableCallsForWasmSjLj(Function &F,
                                    Instruction *FunctionInvocationId,
                                    SmallVectorImpl<PHINode *> &SetjmpRetPHIs);
  Function *getFindMatchingCatch(Module &M, unsigned NumClauses);

  Value *wrapInvoke(CallBase *CI);
  void wrapTestSetjmp(BasicBlock *BB, DebugLoc DL, Value *Threw,
                      Value *FunctionInvocationId, Value *&Label,
                      Value *&LongjmpResult, BasicBlock *&CallEmLongjmpBB,
                      PHINode *&CallEmLongjmpBBThrewPHI,
                      PHINode *&CallEmLongjmpBBThrewValuePHI,
                      BasicBlock *&EndBB);
  Function *getInvokeWrapper(CallBase *CI);

  bool areAllExceptionsAllowed() const { return EHAllowlistSet.empty(); }
  bool supportsException(const Function *F) const {
    return EnableEmEH && (areAllExceptionsAllowed() ||
                          EHAllowlistSet.count(std::string(F->getName())));
  }
  void replaceLongjmpWith(Function *LongjmpF, Function *NewF);

  void rebuildSSA(Function &F);

public:
  static char ID;

  WebAssemblyLowerEmscriptenEHSjLj()
      : ModulePass(ID), EnableEmEH(WebAssembly::WasmEnableEmEH),
        EnableEmSjLj(WebAssembly::WasmEnableEmSjLj),
        EnableWasmSjLj(WebAssembly::WasmEnableSjLj) {
    assert(!(EnableEmSjLj && EnableWasmSjLj) &&
           "Two SjLj modes cannot be turned on at the same time");
    assert(!(EnableEmEH && EnableWasmSjLj) &&
           "Wasm SjLj should be only used with Wasm EH");
    EHAllowlistSet.insert(EHAllowlist.begin(), EHAllowlist.end());
  }
  bool runOnModule(Module &M) override;

  void getAnalysisUsage(AnalysisUsage &AU) const override {
    AU.addRequired<DominatorTreeWrapperPass>();
  }
};
} // End anonymous namespace

char WebAssemblyLowerEmscriptenEHSjLj::ID = 0;
INITIALIZE_PASS(WebAssemblyLowerEmscriptenEHSjLj, DEBUG_TYPE,
                "WebAssembly Lower Emscripten Exceptions / Setjmp / Longjmp",
                false, false)

ModulePass *llvm::createWebAssemblyLowerEmscriptenEHSjLj() {
  return new WebAssemblyLowerEmscriptenEHSjLj();
}

static bool canThrow(const Value *V) {
  if (const auto *F = dyn_cast<const Function>(V)) {
    // Intrinsics cannot throw
    if (F->isIntrinsic())
      return false;
    StringRef Name = F->getName();
    // Leave setjmp and longjmp (mostly) alone; we process them properly later
    if (Name == "setjmp" || Name == "longjmp" || Name == "emscripten_longjmp")
      return false;
    return !F->doesNotThrow();
  }
  // Not a function, so an indirect call: it can throw, we can't tell
  return true;
}

// Get a thread-local global variable with the given name. If it doesn't exist,
// declare it, which will generate an import and assume that it will exist at
// link time.
static GlobalVariable *getGlobalVariable(Module &M, Type *Ty,
                                         WebAssemblyTargetMachine &TM,
                                         const char *Name) {
  auto *GV = dyn_cast<GlobalVariable>(M.getOrInsertGlobal(Name, Ty));
  if (!GV)
    report_fatal_error(Twine("unable to create global: ") + Name);

  // Variables created by this function are thread local. If the target does not
  // support TLS, we depend on CoalesceFeaturesAndStripAtomics to downgrade it
  // to non-thread-local ones, in which case we don't allow this object to be
  // linked with other objects using shared memory.
  GV->setThreadLocalMode(GlobalValue::GeneralDynamicTLSModel);
  return GV;
}

// Simple function name mangler.
// This function simply takes LLVM's string representation of parameter types
// and concatenates them with '_'. There are non-alphanumeric characters but llc
// is ok with it, and we need to postprocess these names after the lowering
// phase anyway.
static std::string getSignature(FunctionType *FTy) {
  std::string Sig;
  raw_string_ostream OS(Sig);
  OS << *FTy->getReturnType();
  for (Type *ParamTy : FTy->params())
    OS << "_" << *ParamTy;
  if (FTy->isVarArg())
    OS << "_...";
  Sig = OS.str();
  erase_if(Sig, isSpace);
  // When s2wasm parses .s file, a comma means the end of an argument. So a
  // mangled function name can contain any character but a comma.
  std::replace(Sig.begin(), Sig.end(), ',', '.');
  return Sig;
}
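
// Illustrative sketch, not part of the pass: getSignature above builds the
// SIG suffix of __invoke_SIG from the printed IR types. The standalone
// version below applies the same three steps (join with '_', strip
// whitespace, turn ',' into '.') to plain strings. The helper name
// mangleSignature and the idea of passing already-printed type strings are
// assumptions for illustration; the real code prints llvm::Type objects.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <string>
#include <vector>

// Hypothetical standalone analog of getSignature. Ret and Params hold the
// textual form of each type, e.g. "i32" or "{ i32, i64 }".
static std::string mangleSignature(const std::string &Ret,
                                   const std::vector<std::string> &Params,
                                   bool IsVarArg) {
  std::string Sig = Ret;
  for (const std::string &P : Params)
    Sig += "_" + P;
  if (IsVarArg)
    Sig += "_...";
  // Remove all whitespace, as erase_if(Sig, isSpace) does in the pass.
  Sig.erase(std::remove_if(
                Sig.begin(), Sig.end(),
                [](unsigned char C) { return std::isspace(C) != 0; }),
            Sig.end());
  // A comma would end an argument for the .s parser, so replace it.
  std::replace(Sig.begin(), Sig.end(), ',', '.');
  return Sig;
}
```

// For example, a void function taking (i32, ptr) yields "void_i32_ptr", and a
// struct parameter printed as "{ i32, i64 }" becomes "{i32.i64}".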

static Function *getEmscriptenFunction(FunctionType *Ty, const Twine &Name,
                                       Module *M) {
  Function *F = Function::Create(Ty, GlobalValue::ExternalLinkage, Name, M);
  // Tell the linker that this function is expected to be imported from the
  // 'env' module.
  if (!F->hasFnAttribute("wasm-import-module")) {
    llvm::AttrBuilder B(M->getContext());
    B.addAttribute("wasm-import-module", "env");
    F->addFnAttrs(B);
  }
  if (!F->hasFnAttribute("wasm-import-name")) {
    llvm::AttrBuilder B(M->getContext());
    B.addAttribute("wasm-import-name", F->getName());
    F->addFnAttrs(B);
  }
  return F;
}

// Returns an integer type for the target architecture's address space.
// i32 for wasm32 and i64 for wasm64.
static Type *getAddrIntType(Module *M) {
  IRBuilder<> IRB(M->getContext());
  return IRB.getIntNTy(M->getDataLayout().getPointerSizeInBits());
}

// Returns an integer pointer type for the target architecture's address space.
// i32* for wasm32 and i64* for wasm64. With opaque pointers this is just a ptr
// in address space zero.
static Type *getAddrPtrType(Module *M) {
  return PointerType::getUnqual(M->getContext());
}

// Returns an integer whose type is the integer type for the target's address
// space. Returns (i32 C) for wasm32 and (i64 C) for wasm64, when C is the
// integer.
static Value *getAddrSizeInt(Module *M, uint64_t C) {
  IRBuilder<> IRB(M->getContext());
  return IRB.getIntN(M->getDataLayout().getPointerSizeInBits(), C);
}

// Returns __cxa_find_matching_catch_N function, where N = NumClauses + 2.
// This is because a landingpad instruction contains two more arguments, a
// personality function and a cleanup bit, and __cxa_find_matching_catch_N
// functions are named after the number of arguments in the original landingpad
// instruction.
Function *
WebAssemblyLowerEmscriptenEHSjLj::getFindMatchingCatch(Module &M,
                                                       unsigned NumClauses) {
  if (FindMatchingCatches.count(NumClauses))
    return FindMatchingCatches[NumClauses];
  PointerType *Int8PtrTy = PointerType::getUnqual(M.getContext());
  SmallVector<Type *, 16> Args(NumClauses, Int8PtrTy);
  FunctionType *FTy = FunctionType::get(Int8PtrTy, Args, false);
  Function *F = getEmscriptenFunction(
      FTy, "__cxa_find_matching_catch_" + Twine(NumClauses + 2), &M);
  FindMatchingCatches[NumClauses] = F;
  return F;
}
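
// Illustrative sketch, not part of the pass: getFindMatchingCatch above
// creates one declaration per clause count and memoizes it. The standalone
// code below shows only the naming rule (N = NumClauses + 2) and the
// create-once caching, using std::map and std::string instead of LLVM types.
// The names CatchNameCache and findMatchingCatchName are hypothetical.

```cpp
#include <cassert>
#include <map>
#include <string>

// Cache keyed by clause count, playing the role of FindMatchingCatches.
static std::map<unsigned, std::string> CatchNameCache;

// Returns the name the pass would give the helper for NumClauses clauses,
// creating the cache entry on first use and reusing it afterwards.
static const std::string &findMatchingCatchName(unsigned NumClauses) {
  auto It = CatchNameCache.find(NumClauses);
  if (It != CatchNameCache.end())
    return It->second;
  // Two extra landingpad operands (personality and cleanup bit) are counted.
  std::string Name =
      "__cxa_find_matching_catch_" + std::to_string(NumClauses + 2);
  return CatchNameCache.emplace(NumClauses, Name).first->second;
}
```

// A landingpad with three clauses therefore maps to a name ending in _5, and
// asking for the same clause count twice does not add a second cache entry.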

// Generate invoke wrapper sequence with preamble and postamble
// Preamble:
// __THREW__ = 0;
// Postamble:
// %__THREW__.val = __THREW__; __THREW__ = 0;
// Returns %__THREW__.val, which indicates whether an exception is thrown (or
// whether longjmp occurred), for future use.
Value *WebAssemblyLowerEmscriptenEHSjLj::wrapInvoke(CallBase *CI) {
  Module *M = CI->getModule();
  LLVMContext &C = M->getContext();

  IRBuilder<> IRB(C);
  IRB.SetInsertPoint(CI);

  // Pre-invoke
  // __THREW__ = 0;
  IRB.CreateStore(getAddrSizeInt(M, 0), ThrewGV);

  // Invoke function wrapper in JavaScript
  SmallVector<Value *, 16> Args;
  // Put the pointer to the callee as first argument, so it can be called
  // within the invoke wrapper later
  Args.push_back(CI->getCalledOperand());
  Args.append(CI->arg_begin(), CI->arg_end());
  CallInst *NewCall = IRB.CreateCall(getInvokeWrapper(CI), Args);
  NewCall->takeName(CI);
  NewCall->setCallingConv(CallingConv::WASM_EmscriptenInvoke);
  NewCall->setDebugLoc(CI->getDebugLoc());

  // Because we added the pointer to the callee as first argument, all
  // argument attribute indices have to be incremented by one.
  SmallVector<AttributeSet, 8> ArgAttributes;
  const AttributeList &InvokeAL = CI->getAttributes();

  // No attributes for the callee pointer.
  ArgAttributes.push_back(AttributeSet());
  // Copy the argument attributes from the original
  for (unsigned I = 0, E = CI->arg_size(); I < E; ++I)
    ArgAttributes.push_back(InvokeAL.getParamAttrs(I));

  AttrBuilder FnAttrs(CI->getContext(), InvokeAL.getFnAttrs());
  if (auto Args = FnAttrs.getAllocSizeArgs()) {
    // The allocsize attribute (if any) refers to parameters by index and needs
    // to be adjusted.
    auto [SizeArg, NEltArg] = *Args;
    SizeArg += 1;
    if (NEltArg)
      NEltArg = *NEltArg + 1;
    FnAttrs.addAllocSizeAttr(SizeArg, NEltArg);
  }
  // In case the callee has 'noreturn' attribute, we need to remove it, because
  // we expect invoke wrappers to return.
  FnAttrs.removeAttribute(Attribute::NoReturn);

  // Reconstruct the AttributeList based on the vector we constructed.
  AttributeList NewCallAL = AttributeList::get(
      C, AttributeSet::get(C, FnAttrs), InvokeAL.getRetAttrs(), ArgAttributes);
  NewCall->setAttributes(NewCallAL);

  CI->replaceAllUsesWith(NewCall);

  // Post-invoke
  // %__THREW__.val = __THREW__; __THREW__ = 0;
  Value *Threw =
      IRB.CreateLoad(getAddrIntType(M), ThrewGV, ThrewGV->getName() + ".val");
  IRB.CreateStore(getAddrSizeInt(M, 0), ThrewGV);
  return Threw;
}

// Get matching invoke wrapper based on callee signature
Function *WebAssemblyLowerEmscriptenEHSjLj::getInvokeWrapper(CallBase *CI) {
  Module *M = CI->getModule();
  SmallVector<Type *, 16> ArgTys;
  FunctionType *CalleeFTy = CI->getFunctionType();

  std::string Sig = getSignature(CalleeFTy);
  if (InvokeWrappers.contains(Sig))
    return InvokeWrappers[Sig];

  // Put the pointer to the callee as first argument
  ArgTys.push_back(PointerType::getUnqual(CalleeFTy));
  // Add argument types
  ArgTys.append(CalleeFTy->param_begin(), CalleeFTy->param_end());

  FunctionType *FTy = FunctionType::get(CalleeFTy->getReturnType(), ArgTys,
                                        CalleeFTy->isVarArg());
  Function *F = getEmscriptenFunction(FTy, "__invoke_" + Sig, M);
  InvokeWrappers[Sig] = F;
  return F;
}

static bool canLongjmp(const Value *Callee) {
  if (auto *CalleeF = dyn_cast<Function>(Callee))
    if (CalleeF->isIntrinsic())
      return false;

  // Attempting to transform inline assembly will result in something like:
  //     call void @__invoke_void(void ()* asm ...)
  // which is invalid because inline assembly blocks do not have addresses
  // and can't be passed by pointer. The result is a crash with illegal IR.
  if (isa<InlineAsm>(Callee))
    return false;
  StringRef CalleeName = Callee->getName();

  // TODO Include more functions or consider checking with mangled prefixes

  // The reason we include malloc/free here is to exclude the malloc/free
  // calls generated in setjmp prep / cleanup routines.
  if (CalleeName == "setjmp" || CalleeName == "malloc" || CalleeName == "free")
    return false;

  // These are functions in Emscripten's JS glue code or compiler-rt
  if (CalleeName == "__resumeException" || CalleeName == "llvm_eh_typeid_for" ||
      CalleeName == "__wasm_setjmp" || CalleeName == "__wasm_setjmp_test" ||
      CalleeName == "getTempRet0" || CalleeName == "setTempRet0")
    return false;

  // __cxa_find_matching_catch_N functions cannot longjmp
  if (Callee->getName().starts_with("__cxa_find_matching_catch_"))
    return false;

  // Exception-catching related functions
  //
  // We intentionally treat __cxa_end_catch longjmpable in Wasm SjLj even though
  // it surely cannot longjmp, in order to maintain the unwind relationship from
  // all existing catchpads (and calls within them) to catch.dispatch.longjmp.
  //
  // In Wasm EH + Wasm SjLj, we
  // 1. Make all catchswitch and cleanuppad that unwind to caller unwind to
  //    catch.dispatch.longjmp instead
  // 2. Convert all longjmpable calls to invokes that unwind to
  //    catch.dispatch.longjmp
  // But catchswitch BBs are removed in isel, so if an EH catchswitch (generated
  // from an exception)'s catchpad does not contain any calls that are converted
  // into invokes unwinding to catch.dispatch.longjmp, this unwind relationship
  // (EH catchswitch BB -> catch.dispatch.longjmp BB) is lost and
  // catch.dispatch.longjmp BB can be placed before the EH catchswitch BB in
  // CFGSort.
  // int ret = setjmp(buf);
  // try {
  //   foo(); // longjmps
  // } catch (...) {
  // }
  // Then in this code, if 'foo' longjmps, it first unwinds to 'catch (...)'
  // catchswitch, and is not caught by that catchswitch because it is a longjmp,
  // then it should next unwind to catch.dispatch.longjmp BB. But if this 'catch
  // (...)' catchswitch -> catch.dispatch.longjmp unwind relationship is lost,
  // it will not unwind to catch.dispatch.longjmp, producing an incorrect
  // result.
  //
  // Every catchpad generated by Wasm C++ contains __cxa_end_catch, so we
  // intentionally treat it as longjmpable to work around this problem. This is
  // a hacky fix but an easy one.
  //
  // The comment block in findWasmUnwindDestinations() in
  // SelectionDAGBuilder.cpp is addressing a similar problem.
  if (CalleeName == "__cxa_end_catch")
    return WebAssembly::WasmEnableSjLj;
  if (CalleeName == "__cxa_begin_catch" ||
      CalleeName == "__cxa_allocate_exception" || CalleeName == "__cxa_throw" ||
      CalleeName == "__clang_call_terminate")
    return false;

  // std::terminate, which is generated when another exception occurs while
  // handling an exception, cannot longjmp.
  if (CalleeName == "_ZSt9terminatev")
    return false;

  // Otherwise we don't know
  return true;
}

static bool isEmAsmCall(const Value *Callee) {
  StringRef CalleeName = Callee->getName();
  // This is an exhaustive list from Emscripten's <emscripten/em_asm.h>.
  return CalleeName == "emscripten_asm_const_int" ||
         CalleeName == "emscripten_asm_const_double" ||
         CalleeName == "emscripten_asm_const_int_sync_on_main_thread" ||
         CalleeName == "emscripten_asm_const_double_sync_on_main_thread" ||
         CalleeName == "emscripten_asm_const_async_on_main_thread";
}
683
684
// Generate __wasm_setjmp_test function call seqence with preamble and
685
// postamble. The code this generates is equivalent to the following
686
// JavaScript code:
687
// %__threwValue.val = __threwValue;
688
// if (%__THREW__.val != 0 & %__threwValue.val != 0) {
689
// %label = __wasm_setjmp_test(%__THREW__.val, functionInvocationId);
690
// if (%label == 0)
691
// emscripten_longjmp(%__THREW__.val, %__threwValue.val);
692
// setTempRet0(%__threwValue.val);
693
// } else {
694
// %label = -1;
695
// }
696
// %longjmp_result = getTempRet0();
697
//
698
// As output parameters. returns %label, %longjmp_result, and the BB the last
699
// instruction (%longjmp_result = ...) is in.
700
void WebAssemblyLowerEmscriptenEHSjLj::wrapTestSetjmp(
701
BasicBlock *BB, DebugLoc DL, Value *Threw, Value *FunctionInvocationId,
702
Value *&Label, Value *&LongjmpResult, BasicBlock *&CallEmLongjmpBB,
703
PHINode *&CallEmLongjmpBBThrewPHI, PHINode *&CallEmLongjmpBBThrewValuePHI,
704
BasicBlock *&EndBB) {
705
Function *F = BB->getParent();
706
Module *M = F->getParent();
707
LLVMContext &C = M->getContext();
708
IRBuilder<> IRB(C);
709
IRB.SetCurrentDebugLocation(DL);
710
711
// if (%__THREW__.val != 0 & %__threwValue.val != 0)
712
IRB.SetInsertPoint(BB);
713
BasicBlock *ThenBB1 = BasicBlock::Create(C, "if.then1", F);
714
BasicBlock *ElseBB1 = BasicBlock::Create(C, "if.else1", F);
715
BasicBlock *EndBB1 = BasicBlock::Create(C, "if.end", F);
716
Value *ThrewCmp = IRB.CreateICmpNE(Threw, getAddrSizeInt(M, 0));
717
Value *ThrewValue = IRB.CreateLoad(IRB.getInt32Ty(), ThrewValueGV,
718
ThrewValueGV->getName() + ".val");
719
Value *ThrewValueCmp = IRB.CreateICmpNE(ThrewValue, IRB.getInt32(0));
720
Value *Cmp1 = IRB.CreateAnd(ThrewCmp, ThrewValueCmp, "cmp1");
721
IRB.CreateCondBr(Cmp1, ThenBB1, ElseBB1);
722
723
// Generate call.em.longjmp BB once and share it within the function
724
if (!CallEmLongjmpBB) {
725
// emscripten_longjmp(%__THREW__.val, %__threwValue.val);
726
CallEmLongjmpBB = BasicBlock::Create(C, "call.em.longjmp", F);
727
IRB.SetInsertPoint(CallEmLongjmpBB);
728
CallEmLongjmpBBThrewPHI = IRB.CreatePHI(getAddrIntType(M), 4, "threw.phi");
729
CallEmLongjmpBBThrewValuePHI =
730
IRB.CreatePHI(IRB.getInt32Ty(), 4, "threwvalue.phi");
731
CallEmLongjmpBBThrewPHI->addIncoming(Threw, ThenBB1);
732
CallEmLongjmpBBThrewValuePHI->addIncoming(ThrewValue, ThenBB1);
733
IRB.CreateCall(EmLongjmpF,
734
{CallEmLongjmpBBThrewPHI, CallEmLongjmpBBThrewValuePHI});
735
IRB.CreateUnreachable();
736
} else {
737
CallEmLongjmpBBThrewPHI->addIncoming(Threw, ThenBB1);
738
CallEmLongjmpBBThrewValuePHI->addIncoming(ThrewValue, ThenBB1);
739
}
740
741
// %label = __wasm_setjmp_test(%__THREW__.val, functionInvocationId);
742
// if (%label == 0)
743
IRB.SetInsertPoint(ThenBB1);
744
BasicBlock *EndBB2 = BasicBlock::Create(C, "if.end2", F);
745
Value *ThrewPtr =
746
IRB.CreateIntToPtr(Threw, getAddrPtrType(M), Threw->getName() + ".p");
747
Value *ThenLabel = IRB.CreateCall(WasmSetjmpTestF,
748
{ThrewPtr, FunctionInvocationId}, "label");
749
Value *Cmp2 = IRB.CreateICmpEQ(ThenLabel, IRB.getInt32(0));
750
IRB.CreateCondBr(Cmp2, CallEmLongjmpBB, EndBB2);
751
752
// setTempRet0(%__threwValue.val);
753
IRB.SetInsertPoint(EndBB2);
754
IRB.CreateCall(SetTempRet0F, ThrewValue);
755
IRB.CreateBr(EndBB1);
756
757
IRB.SetInsertPoint(ElseBB1);
758
IRB.CreateBr(EndBB1);
759
  // %label = phi [%label, if.end2], [-1, if.else1]
  IRB.SetInsertPoint(EndBB1);
  PHINode *LabelPHI = IRB.CreatePHI(IRB.getInt32Ty(), 2, "label");
  LabelPHI->addIncoming(ThenLabel, EndBB2);
  LabelPHI->addIncoming(IRB.getInt32(-1), ElseBB1);

  // Output parameter assignment
  // longjmp_result = getTempRet0();
  Label = LabelPHI;
  EndBB = EndBB1;
  LongjmpResult = IRB.CreateCall(GetTempRet0F, std::nullopt, "longjmp_result");
}
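
// Sketch of the situation rebuildSSA() below is meant to repair. This is
// illustrative pseudo-IR; the block and value names are hypothetical.
//
//   bb:                  ;; defines %x, then calls setjmp
//     %x = ...
//     br %tail
//   tail:                ;; may also be entered from a longjmp dispatch block
//     use(%x)            ;; %x may no longer dominate this use
//
// Each use that its definition does not dominate is registered with
// SSAUpdaterBulk, which is expected to insert PHI nodes where needed and
// rewrite the use.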
772
773
void WebAssemblyLowerEmscriptenEHSjLj::rebuildSSA(Function &F) {
774
DominatorTree &DT = getAnalysis<DominatorTreeWrapperPass>(F).getDomTree();
775
DT.recalculate(F); // CFG has been changed
776
777
SSAUpdaterBulk SSA;
778
for (BasicBlock &BB : F) {
779
for (Instruction &I : BB) {
780
unsigned VarID = SSA.AddVariable(I.getName(), I.getType());
781
// If a value is defined by an invoke instruction, it is only available in
782
// its normal destination and not in its unwind destination.
783
if (auto *II = dyn_cast<InvokeInst>(&I))
784
SSA.AddAvailableValue(VarID, II->getNormalDest(), II);
785
else
786
SSA.AddAvailableValue(VarID, &BB, &I);
787
for (auto &U : I.uses()) {
788
auto *User = cast<Instruction>(U.getUser());
789
if (auto *UserPN = dyn_cast<PHINode>(User))
790
if (UserPN->getIncomingBlock(U) == &BB)
791
continue;
792
if (DT.dominates(&I, User))
793
continue;
794
SSA.AddUse(VarID, &U);
795
}
796
}
797
}
798
SSA.RewriteAllUses(&DT);
799
}
800
801
// Replace uses of longjmp with a new longjmp function in Emscripten library.
802
// In Emscripten SjLj, the new function is
803
// void emscripten_longjmp(uintptr_t, i32)
804
// In Wasm SjLj, the new function is
805
// void __wasm_longjmp(i8*, i32)
806
// Because the original libc longjmp function takes (jmp_buf*, i32), we need a
807
// ptrtoint/bitcast instruction here to make the type match. jmp_buf* will
808
// eventually be lowered to i32/i64 in the wasm backend.
809
void WebAssemblyLowerEmscriptenEHSjLj::replaceLongjmpWith(Function *LongjmpF,
810
Function *NewF) {
811
assert(NewF == EmLongjmpF || NewF == WasmLongjmpF);
812
Module *M = LongjmpF->getParent();
813
SmallVector<CallInst *, 8> ToErase;
814
LLVMContext &C = LongjmpF->getParent()->getContext();
815
IRBuilder<> IRB(C);
816
817
// For calls to longjmp, replace it with emscripten_longjmp/__wasm_longjmp and
818
// cast its first argument (jmp_buf*) appropriately
819
for (User *U : LongjmpF->users()) {
820
auto *CI = dyn_cast<CallInst>(U);
821
if (CI && CI->getCalledFunction() == LongjmpF) {
822
IRB.SetInsertPoint(CI);
823
Value *Env = nullptr;
824
if (NewF == EmLongjmpF)
825
Env =
826
IRB.CreatePtrToInt(CI->getArgOperand(0), getAddrIntType(M), "env");
827
else // WasmLongjmpF
828
Env = IRB.CreateBitCast(CI->getArgOperand(0), IRB.getPtrTy(), "env");
829
IRB.CreateCall(NewF, {Env, CI->getArgOperand(1)});
830
ToErase.push_back(CI);
831
}
832
}
833
for (auto *I : ToErase)
834
I->eraseFromParent();
835
  // If we have any remaining uses of longjmp's function pointer, replace it
  // with (void(*)(jmp_buf*, int))emscripten_longjmp / __wasm_longjmp.
  if (!LongjmpF->use_empty()) {
    Value *NewLongjmp =
        IRB.CreateBitCast(NewF, LongjmpF->getType(), "longjmp.cast");
    LongjmpF->replaceAllUsesWith(NewLongjmp);
  }
}
844
845
static bool containsLongjmpableCalls(const Function *F) {
846
for (const auto &BB : *F)
847
for (const auto &I : BB)
848
if (const auto *CB = dyn_cast<CallBase>(&I))
849
if (canLongjmp(CB->getCalledOperand()))
850
return true;
851
return false;
852
}
853
// When a function contains setjmp calls but no other calls that can longjmp,
// we skip the setjmp transformation for that function. We still need to
// replace those setjmp calls with "i32 0" so they don't cause link-time
// errors. This is safe because setjmp always returns 0 when called directly.
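//
// Illustrative before/after (pseudo-IR; %env, %r and %c are hypothetical
// names):
//   before:  %r = call i32 @setjmp(ptr %env)
//            %c = icmp eq i32 %r, 0
//   after:   %c = icmp eq i32 0, 0      ;; the setjmp call itself is erased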
858
static void nullifySetjmp(Function *F) {
859
Module &M = *F->getParent();
860
IRBuilder<> IRB(M.getContext());
861
Function *SetjmpF = M.getFunction("setjmp");
862
SmallVector<Instruction *, 1> ToErase;
863
864
for (User *U : make_early_inc_range(SetjmpF->users())) {
865
auto *CB = cast<CallBase>(U);
866
BasicBlock *BB = CB->getParent();
867
if (BB->getParent() != F) // in other function
868
continue;
869
CallInst *CI = nullptr;
870
// setjmp cannot throw. So if it is an invoke, lower it to a call
871
if (auto *II = dyn_cast<InvokeInst>(CB))
872
CI = llvm::changeToCall(II);
873
else
874
CI = cast<CallInst>(CB);
875
ToErase.push_back(CI);
876
CI->replaceAllUsesWith(IRB.getInt32(0));
877
}
878
for (auto *I : ToErase)
879
I->eraseFromParent();
880
}
881
882
bool WebAssemblyLowerEmscriptenEHSjLj::runOnModule(Module &M) {
883
LLVM_DEBUG(dbgs() << "********** Lower Emscripten EH & SjLj **********\n");
884
885
LLVMContext &C = M.getContext();
886
IRBuilder<> IRB(C);
887
888
Function *SetjmpF = M.getFunction("setjmp");
889
Function *LongjmpF = M.getFunction("longjmp");
890
  // On some platforms _setjmp and _longjmp are used instead. Redirect their
  // uses to setjmp/longjmp, because we later detect these functions by their
  // names.
894
Function *SetjmpF2 = M.getFunction("_setjmp");
895
Function *LongjmpF2 = M.getFunction("_longjmp");
896
if (SetjmpF2) {
897
if (SetjmpF) {
898
if (SetjmpF->getFunctionType() != SetjmpF2->getFunctionType())
899
report_fatal_error("setjmp and _setjmp have different function types");
900
} else {
901
SetjmpF = Function::Create(SetjmpF2->getFunctionType(),
902
GlobalValue::ExternalLinkage, "setjmp", M);
903
}
904
SetjmpF2->replaceAllUsesWith(SetjmpF);
905
}
906
if (LongjmpF2) {
907
if (LongjmpF) {
908
if (LongjmpF->getFunctionType() != LongjmpF2->getFunctionType())
909
report_fatal_error(
910
"longjmp and _longjmp have different function types");
911
} else {
      LongjmpF = Function::Create(LongjmpF2->getFunctionType(),
                                  GlobalValue::ExternalLinkage, "longjmp", M);
914
}
915
LongjmpF2->replaceAllUsesWith(LongjmpF);
916
}
917
918
auto *TPC = getAnalysisIfAvailable<TargetPassConfig>();
919
assert(TPC && "Expected a TargetPassConfig");
920
auto &TM = TPC->getTM<WebAssemblyTargetMachine>();
921
  // Declare (or get) the global variables __THREW__ and __threwValue and the
  // getTempRet0/setTempRet0 functions, which are shared by exception handling
  // and setjmp/longjmp handling
925
ThrewGV = getGlobalVariable(M, getAddrIntType(&M), TM, "__THREW__");
926
ThrewValueGV = getGlobalVariable(M, IRB.getInt32Ty(), TM, "__threwValue");
927
GetTempRet0F = getEmscriptenFunction(
928
FunctionType::get(IRB.getInt32Ty(), false), "getTempRet0", &M);
929
SetTempRet0F = getEmscriptenFunction(
930
FunctionType::get(IRB.getVoidTy(), IRB.getInt32Ty(), false),
931
"setTempRet0", &M);
932
GetTempRet0F->setDoesNotThrow();
933
SetTempRet0F->setDoesNotThrow();
934
935
bool Changed = false;
936
937
// Function registration for exception handling
938
if (EnableEmEH) {
939
// Register __resumeException function
940
FunctionType *ResumeFTy =
941
FunctionType::get(IRB.getVoidTy(), IRB.getPtrTy(), false);
942
ResumeF = getEmscriptenFunction(ResumeFTy, "__resumeException", &M);
943
ResumeF->addFnAttr(Attribute::NoReturn);
944
945
// Register llvm_eh_typeid_for function
946
FunctionType *EHTypeIDTy =
947
FunctionType::get(IRB.getInt32Ty(), IRB.getPtrTy(), false);
948
EHTypeIDF = getEmscriptenFunction(EHTypeIDTy, "llvm_eh_typeid_for", &M);
949
}
950
  // Functions that contain calls to setjmp but don't have other longjmpable
  // calls within them.
953
SmallPtrSet<Function *, 4> SetjmpUsersToNullify;
954
955
if ((EnableEmSjLj || EnableWasmSjLj) && SetjmpF) {
956
// Precompute setjmp users
957
for (User *U : SetjmpF->users()) {
958
if (auto *CB = dyn_cast<CallBase>(U)) {
959
auto *UserF = CB->getFunction();
        // A function that calls setjmp needs the SjLj transformation only if
        // it also contains other calls that can longjmp. Otherwise its setjmp
        // calls are later replaced with 0 (see nullifySetjmp)
963
if (containsLongjmpableCalls(UserF))
964
SetjmpUsers.insert(UserF);
965
else
966
SetjmpUsersToNullify.insert(UserF);
967
} else {
968
std::string S;
969
raw_string_ostream SS(S);
970
SS << *U;
971
report_fatal_error(Twine("Indirect use of setjmp is not supported: ") +
972
SS.str());
973
}
974
}
975
}
976
977
bool SetjmpUsed = SetjmpF && !SetjmpUsers.empty();
978
bool LongjmpUsed = LongjmpF && !LongjmpF->use_empty();
  DoSjLj = (EnableEmSjLj || EnableWasmSjLj) && (SetjmpUsed || LongjmpUsed);
980
981
// Function registration and data pre-gathering for setjmp/longjmp handling
982
if (DoSjLj) {
983
assert(EnableEmSjLj || EnableWasmSjLj);
984
if (EnableEmSjLj) {
985
// Register emscripten_longjmp function
986
FunctionType *FTy = FunctionType::get(
987
IRB.getVoidTy(), {getAddrIntType(&M), IRB.getInt32Ty()}, false);
988
EmLongjmpF = getEmscriptenFunction(FTy, "emscripten_longjmp", &M);
989
EmLongjmpF->addFnAttr(Attribute::NoReturn);
990
} else { // EnableWasmSjLj
991
Type *Int8PtrTy = IRB.getPtrTy();
992
// Register __wasm_longjmp function, which calls __builtin_wasm_longjmp.
993
FunctionType *FTy = FunctionType::get(
994
IRB.getVoidTy(), {Int8PtrTy, IRB.getInt32Ty()}, false);
995
WasmLongjmpF = getEmscriptenFunction(FTy, "__wasm_longjmp", &M);
996
WasmLongjmpF->addFnAttr(Attribute::NoReturn);
997
}
998
999
if (SetjmpF) {
1000
Type *Int8PtrTy = IRB.getPtrTy();
1001
Type *Int32PtrTy = IRB.getPtrTy();
1002
Type *Int32Ty = IRB.getInt32Ty();
1003
1004
// Register __wasm_setjmp function
1005
FunctionType *SetjmpFTy = SetjmpF->getFunctionType();
1006
FunctionType *FTy = FunctionType::get(
1007
IRB.getVoidTy(), {SetjmpFTy->getParamType(0), Int32Ty, Int32PtrTy},
1008
false);
1009
WasmSetjmpF = getEmscriptenFunction(FTy, "__wasm_setjmp", &M);
1010
1011
// Register __wasm_setjmp_test function
1012
FTy = FunctionType::get(Int32Ty, {Int32PtrTy, Int32PtrTy}, false);
1013
WasmSetjmpTestF = getEmscriptenFunction(FTy, "__wasm_setjmp_test", &M);
1014
1015
// wasm.catch() will be lowered down to wasm 'catch' instruction in
1016
// instruction selection.
1017
CatchF = Intrinsic::getDeclaration(&M, Intrinsic::wasm_catch);
1018
// Type for struct __WasmLongjmpArgs
1019
LongjmpArgsTy = StructType::get(Int8PtrTy, // env
1020
Int32Ty // val
1021
);
1022
}
1023
}
1024
1025
// Exception handling transformation
1026
if (EnableEmEH) {
1027
for (Function &F : M) {
1028
if (F.isDeclaration())
1029
continue;
1030
Changed |= runEHOnFunction(F);
1031
}
1032
}
1033
1034
// Setjmp/longjmp handling transformation
1035
if (DoSjLj) {
1036
Changed = true; // We have setjmp or longjmp somewhere
1037
if (LongjmpF)
1038
replaceLongjmpWith(LongjmpF, EnableEmSjLj ? EmLongjmpF : WasmLongjmpF);
    // Only traverse functions that use setjmp in order not to insert
    // unnecessary prep / cleanup code in every function
1041
if (SetjmpF)
1042
for (Function *F : SetjmpUsers)
1043
runSjLjOnFunction(*F);
1044
}
1045
1046
// Replace unnecessary setjmp calls with 0
1047
if ((EnableEmSjLj || EnableWasmSjLj) && !SetjmpUsersToNullify.empty()) {
1048
Changed = true;
1049
assert(SetjmpF);
1050
for (Function *F : SetjmpUsersToNullify)
1051
nullifySetjmp(F);
1052
}
1053
1054
// Delete unused global variables and functions
1055
for (auto *V : {ThrewGV, ThrewValueGV})
1056
if (V && V->use_empty())
1057
V->eraseFromParent();
1058
for (auto *V : {GetTempRet0F, SetTempRet0F, ResumeF, EHTypeIDF, EmLongjmpF,
1059
WasmSetjmpF, WasmSetjmpTestF, WasmLongjmpF, CatchF})
1060
if (V && V->use_empty())
1061
V->eraseFromParent();
1062
1063
return Changed;
1064
}
1065
1066
bool WebAssemblyLowerEmscriptenEHSjLj::runEHOnFunction(Function &F) {
1067
Module &M = *F.getParent();
1068
LLVMContext &C = F.getContext();
1069
IRBuilder<> IRB(C);
1070
bool Changed = false;
1071
SmallVector<Instruction *, 64> ToErase;
1072
SmallPtrSet<LandingPadInst *, 32> LandingPads;
1073
1074
// rethrow.longjmp BB that will be shared within the function.
1075
BasicBlock *RethrowLongjmpBB = nullptr;
1076
// PHI node for the loaded value of __THREW__ global variable in
1077
// rethrow.longjmp BB
1078
PHINode *RethrowLongjmpBBThrewPHI = nullptr;
1079
1080
for (BasicBlock &BB : F) {
1081
auto *II = dyn_cast<InvokeInst>(BB.getTerminator());
1082
if (!II)
1083
continue;
1084
Changed = true;
1085
LandingPads.insert(II->getLandingPadInst());
1086
IRB.SetInsertPoint(II);
1087
1088
const Value *Callee = II->getCalledOperand();
1089
bool NeedInvoke = supportsException(&F) && canThrow(Callee);
1090
if (NeedInvoke) {
1091
// Wrap invoke with invoke wrapper and generate preamble/postamble
1092
Value *Threw = wrapInvoke(II);
1093
ToErase.push_back(II);
1094
      // If setjmp/longjmp handling is enabled, the thrown value may be a
      // longjmp rather than an exception. If the current function contains
      // calls to setjmp, that case is handled in runSjLjOnFunction. But even
      // if the function does not contain setjmp calls, we shouldn't silently
      // ignore longjmps; we should rethrow them so they can be handled
      // somewhere up the call chain where the setjmp is. __THREW__'s value is
      // 0 when nothing happened, 1 when an exception is thrown, and any other
      // value when a longjmp is thrown.
      //
      // if (%__THREW__.val == 0 || %__THREW__.val == 1)
      //   goto %tail
      // else
      //   goto %rethrow.longjmp
      //
      // rethrow.longjmp: ;; This is longjmp. Rethrow it
      //   %__threwValue.val = __threwValue
      //   emscripten_longjmp(%__THREW__.val, %__threwValue.val);
      //
      // tail: ;; Nothing happened or an exception is thrown
      //   ... Continue exception handling ...
1115
if (DoSjLj && EnableEmSjLj && !SetjmpUsers.count(&F) &&
1116
canLongjmp(Callee)) {
        // Create rethrow.longjmp BB once and share it within the function
1118
if (!RethrowLongjmpBB) {
1119
RethrowLongjmpBB = BasicBlock::Create(C, "rethrow.longjmp", &F);
1120
IRB.SetInsertPoint(RethrowLongjmpBB);
1121
RethrowLongjmpBBThrewPHI =
1122
IRB.CreatePHI(getAddrIntType(&M), 4, "threw.phi");
1123
RethrowLongjmpBBThrewPHI->addIncoming(Threw, &BB);
1124
Value *ThrewValue = IRB.CreateLoad(IRB.getInt32Ty(), ThrewValueGV,
1125
ThrewValueGV->getName() + ".val");
1126
IRB.CreateCall(EmLongjmpF, {RethrowLongjmpBBThrewPHI, ThrewValue});
1127
IRB.CreateUnreachable();
1128
} else {
1129
RethrowLongjmpBBThrewPHI->addIncoming(Threw, &BB);
1130
}
1131
1132
IRB.SetInsertPoint(II); // Restore the insert point back
1133
BasicBlock *Tail = BasicBlock::Create(C, "tail", &F);
1134
Value *CmpEqOne =
1135
IRB.CreateICmpEQ(Threw, getAddrSizeInt(&M, 1), "cmp.eq.one");
1136
Value *CmpEqZero =
1137
IRB.CreateICmpEQ(Threw, getAddrSizeInt(&M, 0), "cmp.eq.zero");
1138
Value *Or = IRB.CreateOr(CmpEqZero, CmpEqOne, "or");
1139
IRB.CreateCondBr(Or, Tail, RethrowLongjmpBB);
1140
IRB.SetInsertPoint(Tail);
1141
BB.replaceSuccessorsPhiUsesWith(&BB, Tail);
1142
}
1143
1144
// Insert a branch based on __THREW__ variable
1145
Value *Cmp = IRB.CreateICmpEQ(Threw, getAddrSizeInt(&M, 1), "cmp");
1146
IRB.CreateCondBr(Cmp, II->getUnwindDest(), II->getNormalDest());
1147
1148
} else {
      // This can't throw, so we don't need this invoke; just replace it with
      // a call+branch
1151
changeToCall(II);
1152
}
1153
}
1154
1155
// Process resume instructions
1156
for (BasicBlock &BB : F) {
1157
// Scan the body of the basic block for resumes
1158
for (Instruction &I : BB) {
1159
auto *RI = dyn_cast<ResumeInst>(&I);
1160
if (!RI)
1161
continue;
1162
Changed = true;
1163
1164
// Split the input into legal values
1165
Value *Input = RI->getValue();
1166
IRB.SetInsertPoint(RI);
1167
Value *Low = IRB.CreateExtractValue(Input, 0, "low");
1168
// Create a call to __resumeException function
1169
IRB.CreateCall(ResumeF, {Low});
1170
// Add a terminator to the block
1171
IRB.CreateUnreachable();
1172
ToErase.push_back(RI);
1173
}
1174
}
1175
1176
// Process llvm.eh.typeid.for intrinsics
1177
for (BasicBlock &BB : F) {
1178
for (Instruction &I : BB) {
1179
auto *CI = dyn_cast<CallInst>(&I);
1180
if (!CI)
1181
continue;
1182
const Function *Callee = CI->getCalledFunction();
1183
if (!Callee)
1184
continue;
1185
if (Callee->getIntrinsicID() != Intrinsic::eh_typeid_for)
1186
continue;
1187
Changed = true;
1188
1189
IRB.SetInsertPoint(CI);
1190
CallInst *NewCI =
1191
IRB.CreateCall(EHTypeIDF, CI->getArgOperand(0), "typeid");
1192
CI->replaceAllUsesWith(NewCI);
1193
ToErase.push_back(CI);
1194
}
1195
}
1196
1197
// Look for orphan landingpads, can occur in blocks with no predecessors
1198
for (BasicBlock &BB : F) {
1199
Instruction *I = BB.getFirstNonPHI();
1200
if (auto *LPI = dyn_cast<LandingPadInst>(I))
1201
LandingPads.insert(LPI);
1202
}
1203
Changed |= !LandingPads.empty();
1204
  // Handle all the landingpads for this function together, as multiple
  // invokes may share a single landingpad
1207
for (LandingPadInst *LPI : LandingPads) {
1208
IRB.SetInsertPoint(LPI);
1209
SmallVector<Value *, 16> FMCArgs;
1210
for (unsigned I = 0, E = LPI->getNumClauses(); I < E; ++I) {
1211
Constant *Clause = LPI->getClause(I);
1212
// TODO Handle filters (= exception specifications).
1213
// https://github.com/llvm/llvm-project/issues/49740
1214
if (LPI->isCatch(I))
1215
FMCArgs.push_back(Clause);
1216
}
1217
1218
// Create a call to __cxa_find_matching_catch_N function
1219
Function *FMCF = getFindMatchingCatch(M, FMCArgs.size());
1220
CallInst *FMCI = IRB.CreateCall(FMCF, FMCArgs, "fmc");
1221
Value *Poison = PoisonValue::get(LPI->getType());
1222
Value *Pair0 = IRB.CreateInsertValue(Poison, FMCI, 0, "pair0");
1223
Value *TempRet0 = IRB.CreateCall(GetTempRet0F, std::nullopt, "tempret0");
1224
Value *Pair1 = IRB.CreateInsertValue(Pair0, TempRet0, 1, "pair1");
1225
1226
LPI->replaceAllUsesWith(Pair1);
1227
ToErase.push_back(LPI);
1228
}
1229
1230
// Erase everything we no longer need in this function
1231
for (Instruction *I : ToErase)
1232
I->eraseFromParent();
1233
1234
return Changed;
1235
}
1236
// This tries to get debug info from the instruction before which a new
// instruction will be inserted. If that instruction has no debug info, it
// tries the previous instruction instead (if any). If neither has debug info
// and a DISubprogram is provided, it creates a dummy debug location at the
// first line of the function, because the IR verifier requires every inlinable
// call site to have debug info when both the caller and the callee have a
// DISubprogram. If none of these conditions is met, it returns an empty
// DebugLoc.
static DebugLoc getOrCreateDebugLoc(const Instruction *InsertBefore,
                                    DISubprogram *SP) {
  assert(InsertBefore);
  if (InsertBefore->getDebugLoc())
    return InsertBefore->getDebugLoc();
  const Instruction *Prev = InsertBefore->getPrevNode();
  if (Prev && Prev->getDebugLoc())
    return Prev->getDebugLoc();
  if (SP)
    return DILocation::get(SP->getContext(), SP->getLine(), 1, SP);
  return DebugLoc();
}
1257
1258
bool WebAssemblyLowerEmscriptenEHSjLj::runSjLjOnFunction(Function &F) {
1259
assert(EnableEmSjLj || EnableWasmSjLj);
1260
Module &M = *F.getParent();
1261
LLVMContext &C = F.getContext();
1262
IRBuilder<> IRB(C);
1263
SmallVector<Instruction *, 64> ToErase;
1264
1265
// Setjmp preparation
1266
1267
BasicBlock *Entry = &F.getEntryBlock();
1268
DebugLoc FirstDL = getOrCreateDebugLoc(&*Entry->begin(), F.getSubprogram());
1269
SplitBlock(Entry, &*Entry->getFirstInsertionPt());
1270
1271
IRB.SetInsertPoint(Entry->getTerminator()->getIterator());
1272
// This alloca'ed pointer is used by the runtime to identify function
1273
// invocations. It's just for pointer comparisons. It will never be
1274
// dereferenced.
1275
Instruction *FunctionInvocationId =
1276
IRB.CreateAlloca(IRB.getInt32Ty(), nullptr, "functionInvocationId");
1277
FunctionInvocationId->setDebugLoc(FirstDL);
1278
1279
// Setjmp transformation
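  //
  // Illustrative sketch of the rewrite done for each setjmp call, derived
  // from the loop below (pseudo-IR; %env, N and the name of the split-off
  // block are hypothetical):
  //
  //   before:
  //     bb:
  //       %r = call i32 @setjmp(ptr %env)
  //       ... uses of %r ...
  //   after:
  //     bb:
  //       call void @__wasm_setjmp(ptr %env, i32 N, ptr %functionInvocationId)
  //       br label %tail
  //     tail:
  //       %setjmp.ret = phi i32 [ 0, %bb ], ...  ;; longjmp edges added later
  //       ... uses of %setjmp.ret ...
  //
  // N is the 1-based position of this setjmp's PHI in SetjmpRetPHIs.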
1280
SmallVector<PHINode *, 4> SetjmpRetPHIs;
1281
Function *SetjmpF = M.getFunction("setjmp");
1282
for (auto *U : make_early_inc_range(SetjmpF->users())) {
1283
auto *CB = cast<CallBase>(U);
1284
BasicBlock *BB = CB->getParent();
1285
if (BB->getParent() != &F) // in other function
1286
continue;
1287
if (CB->getOperandBundle(LLVMContext::OB_funclet)) {
1288
std::string S;
1289
raw_string_ostream SS(S);
1290
SS << "In function " + F.getName() +
1291
": setjmp within a catch clause is not supported in Wasm EH:\n";
1292
SS << *CB;
1293
report_fatal_error(StringRef(SS.str()));
1294
}
1295
1296
CallInst *CI = nullptr;
1297
// setjmp cannot throw. So if it is an invoke, lower it to a call
1298
if (auto *II = dyn_cast<InvokeInst>(CB))
1299
CI = llvm::changeToCall(II);
1300
else
1301
CI = cast<CallInst>(CB);
1302
1303
// The tail is everything right after the call, and will be reached once
1304
// when setjmp is called, and later when longjmp returns to the setjmp
1305
BasicBlock *Tail = SplitBlock(BB, CI->getNextNode());
1306
// Add a phi to the tail, which will be the output of setjmp, which
1307
// indicates if this is the first call or a longjmp back. The phi directly
1308
// uses the right value based on where we arrive from
1309
IRB.SetInsertPoint(Tail, Tail->getFirstNonPHIIt());
1310
PHINode *SetjmpRet = IRB.CreatePHI(IRB.getInt32Ty(), 2, "setjmp.ret");
1311
1312
// setjmp initial call returns 0
1313
SetjmpRet->addIncoming(IRB.getInt32(0), BB);
1314
// The proper output is now this, not the setjmp call itself
1315
CI->replaceAllUsesWith(SetjmpRet);
1316
// longjmp returns to the setjmp will add themselves to this phi
1317
SetjmpRetPHIs.push_back(SetjmpRet);
1318
1319
// Fix call target
1320
// Our index in the function is our place in the array + 1 to avoid index
1321
// 0, because index 0 means the longjmp is not ours to handle.
1322
IRB.SetInsertPoint(CI);
1323
Value *Args[] = {CI->getArgOperand(0), IRB.getInt32(SetjmpRetPHIs.size()),
1324
FunctionInvocationId};
1325
IRB.CreateCall(WasmSetjmpF, Args);
1326
ToErase.push_back(CI);
1327
}
1328
1329
// Handle longjmpable calls.
1330
if (EnableEmSjLj)
1331
handleLongjmpableCallsForEmscriptenSjLj(F, FunctionInvocationId,
1332
SetjmpRetPHIs);
1333
else // EnableWasmSjLj
1334
handleLongjmpableCallsForWasmSjLj(F, FunctionInvocationId, SetjmpRetPHIs);
1335
1336
// Erase everything we no longer need in this function
1337
for (Instruction *I : ToErase)
1338
I->eraseFromParent();
1339
  // Finally, our modifications to the CFG can break dominance of SSA variables.
  // For example, in this code,
  //   if (x()) { .. setjmp() .. }
  //   if (y()) { .. longjmp() .. }
  // we must split the longjmp block, and it can jump into the block split
  // off from the setjmp one. But that means that when we split the setjmp
  // block, its first part no longer dominates its second part - there is a
  // theoretically possible control flow path where x() is false, then y() is
  // true and we reach the second part of the setjmp block, without ever
  // reaching the first part. So, we rebuild SSA form here.
1350
rebuildSSA(F);
1351
return true;
1352
}
1353
1354
// Update each call that can longjmp so it can return to the corresponding
1355
// setjmp. Refer to 4) of "Emscripten setjmp/longjmp handling" section in the
1356
// comments at top of the file for details.
1357
void WebAssemblyLowerEmscriptenEHSjLj::handleLongjmpableCallsForEmscriptenSjLj(
1358
Function &F, Instruction *FunctionInvocationId,
1359
SmallVectorImpl<PHINode *> &SetjmpRetPHIs) {
1360
Module &M = *F.getParent();
1361
LLVMContext &C = F.getContext();
1362
IRBuilder<> IRB(C);
1363
SmallVector<Instruction *, 64> ToErase;
1364
1365
// call.em.longjmp BB that will be shared within the function.
1366
BasicBlock *CallEmLongjmpBB = nullptr;
1367
// PHI node for the loaded value of __THREW__ global variable in
1368
// call.em.longjmp BB
1369
PHINode *CallEmLongjmpBBThrewPHI = nullptr;
1370
// PHI node for the loaded value of __threwValue global variable in
1371
// call.em.longjmp BB
1372
PHINode *CallEmLongjmpBBThrewValuePHI = nullptr;
1373
// rethrow.exn BB that will be shared within the function.
1374
BasicBlock *RethrowExnBB = nullptr;
1375
1376
// Because we are creating new BBs while processing and don't want to make
1377
// all these newly created BBs candidates again for longjmp processing, we
1378
// first make the vector of candidate BBs.
1379
std::vector<BasicBlock *> BBs;
1380
for (BasicBlock &BB : F)
1381
BBs.push_back(&BB);
1382
1383
// BBs.size() will change within the loop, so we query it every time
1384
for (unsigned I = 0; I < BBs.size(); I++) {
1385
BasicBlock *BB = BBs[I];
1386
for (Instruction &I : *BB) {
1387
if (isa<InvokeInst>(&I)) {
1388
std::string S;
1389
raw_string_ostream SS(S);
1390
SS << "In function " << F.getName()
1391
<< ": When using Wasm EH with Emscripten SjLj, there is a "
1392
"restriction that `setjmp` function call and exception cannot be "
1393
"used within the same function:\n";
1394
SS << I;
1395
report_fatal_error(StringRef(SS.str()));
1396
}
1397
auto *CI = dyn_cast<CallInst>(&I);
1398
if (!CI)
1399
continue;
1400
1401
const Value *Callee = CI->getCalledOperand();
1402
if (!canLongjmp(Callee))
1403
continue;
1404
if (isEmAsmCall(Callee))
1405
report_fatal_error("Cannot use EM_ASM* alongside setjmp/longjmp in " +
1406
F.getName() +
1407
". Please consider using EM_JS, or move the "
1408
"EM_ASM into another function.",
1409
false);
1410
1411
Value *Threw = nullptr;
1412
BasicBlock *Tail;
1413
if (Callee->getName().starts_with("__invoke_")) {
1414
// If invoke wrapper has already been generated for this call in
1415
// previous EH phase, search for the load instruction
1416
// %__THREW__.val = __THREW__;
1417
// in postamble after the invoke wrapper call
        LoadInst *ThrewLI = nullptr;
        StoreInst *ThrewResetSI = nullptr;
        for (auto I = std::next(BasicBlock::iterator(CI)), IE = BB->end();
             I != IE; ++I) {
          if (auto *LI = dyn_cast<LoadInst>(I))
            if (auto *GV = dyn_cast<GlobalVariable>(LI->getPointerOperand()))
              if (GV == ThrewGV) {
                Threw = ThrewLI = LI;
                break;
              }
        }
        // The search below starts from ThrewLI, so check it before using it
        assert(Threw && ThrewLI && "Cannot find __THREW__ load after invoke");
        // Search for the store instruction after the load above
        //   __THREW__ = 0;
        for (auto I = std::next(BasicBlock::iterator(ThrewLI)), IE = BB->end();
             I != IE; ++I) {
          if (auto *SI = dyn_cast<StoreInst>(I)) {
            if (auto *GV = dyn_cast<GlobalVariable>(SI->getPointerOperand())) {
              if (GV == ThrewGV &&
                  SI->getValueOperand() == getAddrSizeInt(&M, 0)) {
                ThrewResetSI = SI;
                break;
              }
            }
          }
        }
        assert(ThrewResetSI && "Cannot find __THREW__ store after invoke");
1445
Tail = SplitBlock(BB, ThrewResetSI->getNextNode());
1446
1447
} else {
1448
// Wrap call with invoke wrapper and generate preamble/postamble
1449
Threw = wrapInvoke(CI);
1450
ToErase.push_back(CI);
1451
Tail = SplitBlock(BB, CI->getNextNode());
1452
        // If exception handling is enabled, the thrown value may be an
        // exception rather than a longjmp, in which case we shouldn't
        // silently ignore it; we should rethrow it.
        // __THREW__'s value is 0 when nothing happened, 1 when an exception is
        // thrown, and any other value when a longjmp is thrown.
        //
        // if (%__THREW__.val == 1)
        //   goto %rethrow.exn
        // else
        //   goto %normal
        //
        // rethrow.exn: ;; Rethrow exception
        //   %exn = call @__cxa_find_matching_catch_2() ;; Retrieve thrown ptr
        //   __resumeException(%exn)
        //
        // normal:
        //   <-- Insertion point. Will insert sjlj handling code from here
        //   goto %tail
        //
        // tail:
        //   ...
1474
if (supportsException(&F) && canThrow(Callee)) {
1475
// We will add a new conditional branch. So remove the branch created
1476
// when we split the BB
1477
ToErase.push_back(BB->getTerminator());
1478
1479
// Generate rethrow.exn BB once and share it within the function
1480
if (!RethrowExnBB) {
1481
RethrowExnBB = BasicBlock::Create(C, "rethrow.exn", &F);
1482
IRB.SetInsertPoint(RethrowExnBB);
1483
CallInst *Exn =
1484
IRB.CreateCall(getFindMatchingCatch(M, 0), {}, "exn");
1485
IRB.CreateCall(ResumeF, {Exn});
1486
IRB.CreateUnreachable();
1487
}
1488
1489
IRB.SetInsertPoint(CI);
1490
BasicBlock *NormalBB = BasicBlock::Create(C, "normal", &F);
1491
Value *CmpEqOne =
1492
IRB.CreateICmpEQ(Threw, getAddrSizeInt(&M, 1), "cmp.eq.one");
1493
IRB.CreateCondBr(CmpEqOne, RethrowExnBB, NormalBB);
1494
1495
IRB.SetInsertPoint(NormalBB);
1496
IRB.CreateBr(Tail);
1497
BB = NormalBB; // New insertion point to insert __wasm_setjmp_test()
1498
}
1499
}
1500
      // We need to replace the terminator of BB: SplitBlock makes BB go
      // straight to Tail, but we need to check whether a longjmp occurred and,
      // if so, go to the right setjmp tail
1504
ToErase.push_back(BB->getTerminator());
1505
1506
// Generate a function call to __wasm_setjmp_test function and
1507
// preamble/postamble code to figure out (1) whether longjmp
1508
// occurred (2) if longjmp occurred, which setjmp it corresponds to
1509
Value *Label = nullptr;
1510
Value *LongjmpResult = nullptr;
1511
BasicBlock *EndBB = nullptr;
1512
wrapTestSetjmp(BB, CI->getDebugLoc(), Threw, FunctionInvocationId, Label,
1513
LongjmpResult, CallEmLongjmpBB, CallEmLongjmpBBThrewPHI,
1514
CallEmLongjmpBBThrewValuePHI, EndBB);
1515
assert(Label && LongjmpResult && EndBB);
1516
1517
// Create switch instruction
1518
IRB.SetInsertPoint(EndBB);
1519
IRB.SetCurrentDebugLocation(EndBB->back().getDebugLoc());
1520
SwitchInst *SI = IRB.CreateSwitch(Label, Tail, SetjmpRetPHIs.size());
      // -1 means no longjmp happened, continue normally (will hit the default
      // switch case). 0 means a longjmp that is not ours to handle; that case
      // has already branched to call.em.longjmp in wrapTestSetjmp. Otherwise
      // the label is the index into SetjmpRetPHIs plus 1 (to avoid 0).
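      //
      // Illustrative shape of the resulting switch for a function with two
      // setjmp calls (pseudo-IR; the %setjmp.tail.N block names are
      // hypothetical):
      //   switch i32 %label, label %tail [
      //     i32 1, label %setjmp.tail.1
      //     i32 2, label %setjmp.tail.2
      //   ]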
1525
for (unsigned I = 0; I < SetjmpRetPHIs.size(); I++) {
1526
SI->addCase(IRB.getInt32(I + 1), SetjmpRetPHIs[I]->getParent());
1527
SetjmpRetPHIs[I]->addIncoming(LongjmpResult, EndBB);
1528
}
1529
      // We are splitting the block here, and must continue to find other calls
      // in the block, which is now split, so continue the traversal in Tail
1532
BBs.push_back(Tail);
1533
}
1534
}
1535
1536
for (Instruction *I : ToErase)
1537
I->eraseFromParent();
1538
}
1539
1540
static BasicBlock *getCleanupRetUnwindDest(const CleanupPadInst *CPI) {
1541
for (const User *U : CPI->users())
1542
if (const auto *CRI = dyn_cast<CleanupReturnInst>(U))
1543
return CRI->getUnwindDest();
1544
return nullptr;
1545
}

// Create a catchpad in which we catch a longjmp's env and val arguments, test
// if the longjmp corresponds to one of the setjmps in the current function,
// and if so, jump to the setjmp dispatch BB from which we go to one of the
// post-setjmp BBs. Refer to 4) of the "Wasm setjmp/longjmp handling" section
// in the comments at the top of the file for details.
void WebAssemblyLowerEmscriptenEHSjLj::handleLongjmpableCallsForWasmSjLj(
    Function &F, Instruction *FunctionInvocationId,
    SmallVectorImpl<PHINode *> &SetjmpRetPHIs) {
  Module &M = *F.getParent();
  LLVMContext &C = F.getContext();
  IRBuilder<> IRB(C);

  // A function with catchswitch/catchpad instruction should have a personality
  // function attached to it. Search for the wasm personality function, and if
  // it exists, use it, and if it doesn't, create a dummy personality function.
  // (SjLj is not going to call it anyway.)
  if (!F.hasPersonalityFn()) {
    StringRef PersName = getEHPersonalityName(EHPersonality::Wasm_CXX);
    FunctionType *PersType =
        FunctionType::get(IRB.getInt32Ty(), /* isVarArg */ true);
    Value *PersF = M.getOrInsertFunction(PersName, PersType).getCallee();
    F.setPersonalityFn(
        cast<Constant>(IRB.CreateBitCast(PersF, IRB.getPtrTy())));
  }

  // Use the entry BB's debugloc as a fallback
  BasicBlock *Entry = &F.getEntryBlock();
  DebugLoc FirstDL = getOrCreateDebugLoc(&*Entry->begin(), F.getSubprogram());
  IRB.SetCurrentDebugLocation(FirstDL);

  // Add setjmp.dispatch BB right after the entry block. Because we have
  // initialized functionInvocationId in the entry block and split the
  // rest into another BB, here 'OrigEntry' is the function's original entry
  // block before the transformation.
  //
  // entry:
  //   functionInvocationId initialization
  // setjmp.dispatch:
  //   switch will be inserted here later
  // entry.split: (OrigEntry)
  //   the original function starts here
  BasicBlock *OrigEntry = Entry->getNextNode();
  BasicBlock *SetjmpDispatchBB =
      BasicBlock::Create(C, "setjmp.dispatch", &F, OrigEntry);
  cast<BranchInst>(Entry->getTerminator())->setSuccessor(0, SetjmpDispatchBB);

  // Create catch.dispatch.longjmp BB and a catchswitch instruction
  BasicBlock *CatchDispatchLongjmpBB =
      BasicBlock::Create(C, "catch.dispatch.longjmp", &F);
  IRB.SetInsertPoint(CatchDispatchLongjmpBB);
  CatchSwitchInst *CatchSwitchLongjmp =
      IRB.CreateCatchSwitch(ConstantTokenNone::get(C), nullptr, 1);

  // Create catch.longjmp BB and a catchpad instruction
  BasicBlock *CatchLongjmpBB = BasicBlock::Create(C, "catch.longjmp", &F);
  CatchSwitchLongjmp->addHandler(CatchLongjmpBB);
  IRB.SetInsertPoint(CatchLongjmpBB);
  CatchPadInst *CatchPad = IRB.CreateCatchPad(CatchSwitchLongjmp, {});

  // Wasm throw and catch instructions can throw and catch multiple values, but
  // that requires multivalue support in the toolchain, which is currently not
  // very reliable. We instead throw and catch a pointer to a struct value of
  // type 'struct __WasmLongjmpArgs', which is defined in Emscripten.
  Instruction *LongjmpArgs =
      IRB.CreateCall(CatchF, {IRB.getInt32(WebAssembly::C_LONGJMP)}, "thrown");
  Value *EnvField =
      IRB.CreateConstGEP2_32(LongjmpArgsTy, LongjmpArgs, 0, 0, "env_gep");
  Value *ValField =
      IRB.CreateConstGEP2_32(LongjmpArgsTy, LongjmpArgs, 0, 1, "val_gep");
  // void *env = __wasm_longjmp_args.env;
  Instruction *Env = IRB.CreateLoad(IRB.getPtrTy(), EnvField, "env");
  // int val = __wasm_longjmp_args.val;
  Instruction *Val = IRB.CreateLoad(IRB.getInt32Ty(), ValField, "val");

  // %label = __wasm_setjmp_test(%env, functionInvocationId);
  // if (%label == 0)
  //   __wasm_longjmp(%env, %val)
  // catchret to %setjmp.dispatch
  BasicBlock *ThenBB = BasicBlock::Create(C, "if.then", &F);
  BasicBlock *EndBB = BasicBlock::Create(C, "if.end", &F);
  Value *EnvP = IRB.CreateBitCast(Env, getAddrPtrType(&M), "env.p");
  Value *Label = IRB.CreateCall(WasmSetjmpTestF, {EnvP, FunctionInvocationId},
                                OperandBundleDef("funclet", CatchPad), "label");
  Value *Cmp = IRB.CreateICmpEQ(Label, IRB.getInt32(0));
  IRB.CreateCondBr(Cmp, ThenBB, EndBB);

  IRB.SetInsertPoint(ThenBB);
  CallInst *WasmLongjmpCI = IRB.CreateCall(
      WasmLongjmpF, {Env, Val}, OperandBundleDef("funclet", CatchPad));
  IRB.CreateUnreachable();

  IRB.SetInsertPoint(EndBB);
  // Jump to setjmp.dispatch block
  IRB.CreateCatchRet(CatchPad, SetjmpDispatchBB);

  // Go back to setjmp.dispatch BB
  // setjmp.dispatch:
  //   switch %label {
  //     label 1: goto post-setjmp BB 1
  //     label 2: goto post-setjmp BB 2
  //     ...
  //     default: goto entry.split (OrigEntry)
  //   }
  IRB.SetInsertPoint(SetjmpDispatchBB);
  PHINode *LabelPHI = IRB.CreatePHI(IRB.getInt32Ty(), 2, "label.phi");
  LabelPHI->addIncoming(Label, EndBB);
  LabelPHI->addIncoming(IRB.getInt32(-1), Entry);
  SwitchInst *SI = IRB.CreateSwitch(LabelPHI, OrigEntry, SetjmpRetPHIs.size());
  // -1 means no longjmp happened, continue normally (will hit the default
  // switch case). 0 means a longjmp that is not ours to handle, needs a
  // rethrow. Otherwise the label is the index into SetjmpRetPHIs plus 1
  // (to avoid 0).
  for (unsigned I = 0; I < SetjmpRetPHIs.size(); I++) {
    SI->addCase(IRB.getInt32(I + 1), SetjmpRetPHIs[I]->getParent());
    SetjmpRetPHIs[I]->addIncoming(Val, SetjmpDispatchBB);
  }

  // Convert all longjmpable call instructions to invokes that unwind to the
  // newly created catch.dispatch.longjmp BB.
  SmallVector<CallInst *, 64> LongjmpableCalls;
  for (auto *BB = &*F.begin(); BB; BB = BB->getNextNode()) {
    for (auto &I : *BB) {
      auto *CI = dyn_cast<CallInst>(&I);
      if (!CI)
        continue;
      const Value *Callee = CI->getCalledOperand();
      if (!canLongjmp(Callee))
        continue;
      if (isEmAsmCall(Callee))
        report_fatal_error("Cannot use EM_ASM* alongside setjmp/longjmp in " +
                               F.getName() +
                               ". Please consider using EM_JS, or move the "
                               "EM_ASM into another function.",
                           false);
      // This is the __wasm_longjmp() call we inserted in this function, which
      // rethrows the longjmp when the longjmp does not correspond to one of
      // the setjmps in this function. We should not convert this call to an
      // invoke.
      if (CI == WasmLongjmpCI)
        continue;
      LongjmpableCalls.push_back(CI);
    }
  }

  for (auto *CI : LongjmpableCalls) {
    // Even if the callee function has attribute 'nounwind', which is true for
    // all C functions, it can longjmp, which means it can throw a Wasm
    // exception now.
    CI->removeFnAttr(Attribute::NoUnwind);
    if (Function *CalleeF = CI->getCalledFunction())
      CalleeF->removeFnAttr(Attribute::NoUnwind);

    // Change it to an invoke and make it unwind to the catch.dispatch.longjmp
    // BB. If the call is enclosed in another catchpad/cleanuppad scope, unwind
    // to its parent pad's unwind destination instead to preserve the scope
    // structure. It will eventually unwind to the catch.dispatch.longjmp.
    SmallVector<OperandBundleDef, 1> Bundles;
    BasicBlock *UnwindDest = nullptr;
    if (auto Bundle = CI->getOperandBundle(LLVMContext::OB_funclet)) {
      Instruction *FromPad = cast<Instruction>(Bundle->Inputs[0]);
      while (!UnwindDest) {
        if (auto *CPI = dyn_cast<CatchPadInst>(FromPad)) {
          UnwindDest = CPI->getCatchSwitch()->getUnwindDest();
          break;
        }
        if (auto *CPI = dyn_cast<CleanupPadInst>(FromPad)) {
          // getCleanupRetUnwindDest() can return nullptr when
          // 1. This cleanuppad's matching cleanupret unwinds to caller
          // 2. There is no matching cleanupret because it ends with
          //    unreachable.
          // In case of 2, we need to traverse the parent pad chain.
          UnwindDest = getCleanupRetUnwindDest(CPI);
          Value *ParentPad = CPI->getParentPad();
          if (isa<ConstantTokenNone>(ParentPad))
            break;
          FromPad = cast<Instruction>(ParentPad);
        }
      }
    }
    if (!UnwindDest)
      UnwindDest = CatchDispatchLongjmpBB;
    changeToInvokeAndSplitBasicBlock(CI, UnwindDest);
  }

  SmallVector<Instruction *, 16> ToErase;
  for (auto &BB : F) {
    if (auto *CSI = dyn_cast<CatchSwitchInst>(BB.getFirstNonPHI())) {
      if (CSI != CatchSwitchLongjmp && CSI->unwindsToCaller()) {
        IRB.SetInsertPoint(CSI);
        ToErase.push_back(CSI);
        auto *NewCSI = IRB.CreateCatchSwitch(CSI->getParentPad(),
                                             CatchDispatchLongjmpBB, 1);
        NewCSI->addHandler(*CSI->handler_begin());
        NewCSI->takeName(CSI);
        CSI->replaceAllUsesWith(NewCSI);
      }
    }

    if (auto *CRI = dyn_cast<CleanupReturnInst>(BB.getTerminator())) {
      if (CRI->unwindsToCaller()) {
        IRB.SetInsertPoint(CRI);
        ToErase.push_back(CRI);
        IRB.CreateCleanupRet(CRI->getCleanupPad(), CatchDispatchLongjmpBB);
      }
    }
  }

  for (Instruction *I : ToErase)
    I->eraseFromParent();
}
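
// Illustrative sketch, not part of this pass: a standalone model of the
// parent pad walk used above to pick an unwind destination for a longjmpable
// call. The Pad struct and findUnwindDest are hypothetical stand-ins for the
// LLVM pad instructions. An empty UnwindDest models a null unwind destination
// and a null Parent models a 'none' parent pad token.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for a funclet pad; not an LLVM type.
struct Pad {
  enum Kind { Catch, Cleanup };
  Kind K;
  std::string UnwindDest; // Empty: unwinds to caller or has no cleanupret.
  const Pad *Parent;      // nullptr models a 'none' parent pad token.
};

// Walk outwards from the pad enclosing a call until a destination is found.
// Fallback plays the role of the catch.dispatch.longjmp block.
inline std::string findUnwindDest(const Pad *From,
                                  const std::string &Fallback) {
  std::string Dest;
  while (From && Dest.empty()) {
    Dest = From->UnwindDest;
    if (From->K == Pad::Catch)
      break; // A catchpad's catchswitch decides; do not look further up.
    From = From->Parent; // Cleanup pad: keep climbing if nothing was found.
  }
  return Dest.empty() ? Fallback : Dest;
}
```

// In this model, a cleanup pad with no destination of its own inherits the
// destination of an enclosing cleanup pad, and a call outside any pad uses
// the fallback block directly.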