---
---
Study tips and problem solving {.title}
How to use the material and how to solve common problems.
Using these materials
The material is organized by topics of increasing complexity
Feel free to skip ahead if you already know the basics
Read the slides / watch the videos first
Complete the tutorials to make sure you've got the steps right
Try out one or more of the exercises to verify your new skills
If you get stuck, consult the Docs CSC pages linked from the topic slides
Press shift to open links with additional information in a new window
Left-click on slides and you can then navigate them with the arrow keys
General problem solving
Go to docs.csc.fi and look in the relevant section of the navigation
Try the FAQ
Try the search function in CSC Docs or search the web
Type a keyword in CSC Docs, or copy/paste the error message into your favorite search engine
Send an email to [email protected] containing:
A descriptive title
What you wanted to achieve and on which computer
Which commands you have given
What error messages resulted
Running a new application in Puhti 1/2
If it comes with tutorials, do at least one
This will likely be the fastest way forward
Naturally, read the manual/instructions
Check if there's a page about it in Docs CSC
If there is, use the batch script example from there
Otherwise, use a general template
Try first running interactively (not on a login node)
Perhaps it is easier to find the correct command line options
Use the top command to get a rough estimate of memory use, etc.
If developers provide some test or example data, run it first and make sure results are correct
Running a new application in Puhti 2/2
You can use the test queue to check that your batch job script is correct
Limits: 15 min, 2 nodes
Job turnaround usually very fast even if machine is "full"
Can be useful to spot typos, missing files, etc. before submitting a job that will idle in the queue
Before large runs, it's a good idea to do a smaller trial run
Check that results are as expected
Check the resource usage after the test run and adjust accordingly
How many cores to allocate?
This depends on many things, so you have to try; see our performance checklist
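The test-queue check above can be sketched roughly as follows. This is only an illustration, assuming the test queue is the Slurm partition named `test`; the project name `project_2001234`, the program `./my_app`, its input `input.dat` and the helper names `write_test_job` and `check_job_script` are hypothetical placeholders. Note that `bash -n` only catches shell syntax errors; wrong `#SBATCH` options or missing files show up when the job actually runs in the test queue.

```shell
#!/bin/bash
# Write a small batch script for a trial run in the test queue.
# project_2001234, ./my_app and input.dat are placeholders.
write_test_job() {
    cat > "$1" <<'EOF'
#!/bin/bash
#SBATCH --job-name=trial_run
#SBATCH --account=project_2001234
#SBATCH --partition=test
#SBATCH --time=00:15:00
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --mem-per-cpu=1G

srun ./my_app input.dat
EOF
}

# Check shell syntax only; to bash the #SBATCH lines are just comments.
check_job_script() {
    bash -n "$1" && echo "syntax ok: $1"
}

# Usage on a cluster:
#   write_test_job test_job.sh
#   check_job_script test_job.sh && sbatch test_job.sh
```

Once the trial run works, the same script can be pointed at a production partition with a longer time limit.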
What if your job fails? Troubleshooting checklist 1/2
Did the job run out of time?
Did the job run out of memory?
Did the job actually use the resources you specified?
Problems in the batch job script can cause parameters to be ignored and default values to be used instead
Did it fail immediately or did it run for some time?
Jobs failing immediately are often due to something simple like typos, missing inputs, bad parameters, etc.
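The first questions in this checklist can often be answered from the job state that Slurm records. The sketch below is a hedged illustration: `explain_state` and `job_state` are hypothetical helper names, the job ID is made up, and the advice strings are only suggestions. It assumes the standard Slurm `sacct` command is available on the system.

```shell
#!/bin/bash
# Map a Slurm job state to the checklist question it answers.
explain_state() {
    case "$1" in
        TIMEOUT)       echo "ran out of time: raise --time or reduce the work" ;;
        OUT_OF_MEMORY) echo "ran out of memory: raise the memory request" ;;
        FAILED)        echo "program exited with an error: check the error file" ;;
        COMPLETED)     echo "finished normally" ;;
        *)             echo "state $1: see the Slurm documentation" ;;
    esac
}

# Look up the state of a finished job (needs a system with Slurm).
# -X: allocation only, -n: no header, -P: unpadded output
job_state() {
    sacct -X -n -P -j "$1" --format=State
}

# Usage on a cluster, with a hypothetical job ID:
#   explain_state "$(job_state 1234567)"
#   sacct -j 1234567 --format=JobID,State,Elapsed,Timelimit,ReqMem,MaxRSS
```

Comparing `Elapsed` with `Timelimit` and `MaxRSS` with `ReqMem` in the last command also shows whether the job actually used what was requested.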
What if your job fails? Troubleshooting checklist 2/2
Check the error file captured by the batch job script
Check any other error files and logs that your program may have produced
Error messages can sometimes be long, cryptic and a bit intimidating, but ...
Try skimming through them and see if you can spot something "human-readable"
Often you can spot the actual problem easily if you go through the whole message. Something like "required input file so-and-so missing" or "parameter X out of range", etc.
Consult the FAQ on common Slurm issues in the CSC Docs
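Skimming a long error file for the "human-readable" part can be helped along with `grep`. The following is a minimal sketch, not a complete diagnosis tool: the helper name `scan_log`, the file name `my_job.err` and the keyword list are assumptions to adapt to your own program.

```shell
#!/bin/bash
# Print the lines of a log that contain common failure words,
# prefixed with their line numbers. The keyword list is a guess.
scan_log() {
    grep -n -i -E 'error|missing|not found|no such file|out of range|killed|denied' "$1"
}

# Usage with a hypothetical error file name:
#   scan_log my_job.err | head -n 20
```

The line numbers make it easy to open the file at the right place and read the surrounding context.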
Document your discoveries
When you've successfully solved an issue, make the solution easy to find again
Set up a file in your $HOME and add your commands there
It's quick to copy/paste from the screen to the end of the file
... and finding them with grep later is quick (grep them $HOME/vault)
bash history is nice, but it also keeps the commands that didn't work...
Note, don't overwrite your vault file (e.g., with cat > $HOME/vault)
Store scripts in $HOME/bin and take backups
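One possible way to keep such a vault file is sketched below. The helper names `vault_add` and `vault_find` and the `VAULT` variable are hypothetical; the file defaults to `$HOME/vault` as in the slide. The important detail is `>>`, which appends, whereas a single `>` would overwrite the file.

```shell
#!/bin/bash
# Append one working command and a short note to the vault file.
# ">>" appends to the file; a single ">" would overwrite it.
# Arguments: $1 = note, $2 = command
vault_add() {
    printf '%s  # %s\n' "$2" "$1" >> "${VAULT:-$HOME/vault}"
}

# Search the vault for a keyword, ignoring case.
vault_find() {
    grep -i -- "$1" "${VAULT:-$HOME/vault}"
}

# Usage (the job ID is made up):
#   vault_add "check job efficiency" "seff 1234567"
#   vault_find efficiency
```

Because the note is stored as a trailing comment, a line found with `vault_find` can be pasted straight back into the shell.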