guide

Debugging with the use of log files

When developing applications on Golem, sooner or later you will get to a point where something is not working as expected. Executing commands on the provider-hosted container seems to be one of the most error-prone areas. If something is going wrong on the provider side, we need to have a tool that helps us in the identification of the problem. When we know what the error details are, it usually is much easier to correct our code and eliminate the bug.

Introducing log files

The good news is that Golem's high-level API supports log files. The log files contain all sorts of interesting stuff. Among others you will find there:

  • Requestor > Provider file transfer details
  • Output streams of the commands executed on the Provider: stdout and stderr
  • Provider > Requestor file transfer details

How to enable logging?

The logging infrastructure in Golem's high-level API is quite universal and thus somewhat complex, but for most of the cases using the SummaryLogger will do the job.

The SummaryLoggeris an event listener that listens to all the atomic events and combines them into the more aggregated track of events that is easier to grasp by humans.

To enable logging to a file we need two things.

enable_default_logger(log_file=args.log_file)

1. Theenable_default_logger call which enables the default logger, which:

  • outputs to the Requestor application's stderr all the log messages with the level INFO
  • if the log_file is specified, outputs to the file specified all the log messages with the level INFO and DEBUG

2. Then in the Executorwe add:

async with Executor(
    ...
    event_consumer=log_summary(log_event_repr),
    ...
) as executor:

Here, the log_summary creates SummaryLogger instance that is passed as event consumer to the Executor we are creating.

Let's have an error

We are going to simulate some bug in the code now. We will take a fully working piece of code, modify it a bit (create the bug in the code that produces and error on the provider's side) and then, by analyzing the log file, we will track what the root cause of the problem is. In the end, we will fix the code and check the details of its work with the analysis of the log file.

Let's look at the code of yacat that is part of the Golem high-level Python API repository. All the code that enables logging is already there. We just need to use --log-file command-line switch.

In line 29 we are generating the body of keyspace.sh script to be executed on the provider:

Now we are simulating a typo in the code. Instead of:

hashcat --keyspace -a 3 {mask} -m 400

let's have the following code:

hashcat --keeyspace -a 3 {mask} -m 400

So, just an additional e character that has passed unnoticed. A typical developer day.

Reading the log file

Let's try to track the effect of our typo in the contents of log.txt - the file that has been generated by yacat as a result of us using --log-file log.txt.

First, let's find the entry corresponding with the first ctx.run call. We can do it by searching the log.txt file (from the top of the file, for example by hitting Ctrl-F in our favourite file viewer) in search of

command={'run'

This way, we can find a log entry that looks like this:

[2021-01-22 15:20:16,273 DEBUG yapapi.events]
CommandStarted(agr_id='b39318a4c682010ea9be37201b8f428fad81b28cfe949ca11cca9912a465d34c',
task_id='1', cmd_idx=3, command={'run': {'entry_point': '/bin/sh', 'args':
['/golem/work/keyspace.sh'], 'capture': {'stdout': {'stream': {}}, 'stderr': {'stream': {}}}}})

Now, below it, there are the log entries we were looking for:

[2021-01-22 15:20:16,383 DEBUG yapapi.events]
CommandStdErr(agr_id='b39318a4c682010ea9be37201b8f428fad81b28cfe949ca11cca9912a465d34c',
task_id='1', cmd_idx=3,
output="hashcat: unrecognized option '--keeyspace'\n\x1b[31mInvalid argument specified.\x1b[0m\n\n")
[2021-01-22 15:20:16,385 DEBUG yapapi.events]
CommandExecuted(agr_id='b39318a4c682010ea9be37201b8f428fad81b28cfe949ca11cca9912a465d34c',
task_id='1', cmd_idx=3, command={'run': {'entry_point': '/bin/sh', 'args':
('/golem/work/keyspace.sh',), 'capture': {'stdout': {'stream': {}}, 'stderr': {'stream': {}}}}},
success=False, message='ExeScript command exited with code 255')

There are two pieces of those log entries that are the most important to us:

hashcat: unrecognized option '--keeyspace'\n\x1b[31mInvalid argument specified.\x1b[0m\n

This is a stderr message generated by running the hashcat binary. It seems that hashcat does not recognize the --keeyspace parameter. This is indeed the error we simulated.

Congratulations! You have just tracked the bug by analyzing the requestor log file!

The other interesting piece of information here is:

ExeScript command exited with code 255

From this message, we know that the effect of this ctx.run resulted in the exit code having a value of 255 which has been returned by hashcat on exit. If there was any additional information in the command exit code, we would be able to see it here.

The 255 value, we have in the log file here is generated by equivalent of POSIX exit(-1) executed on the hashcat termination. The -1 was converted to unsigned int and came back as255.

Fixing the error

Now, we can use that knowledge to fix the write_keyspace_check_script function by bringing it back to the correct form:

def write_keyspace_check_script(mask):
    with open("keyspace.sh", "w") as f:
        f.write(f"hashcat --keyspace -a 3 {mask} -m 400 > /golem/work/keyspace.txt")

After saving the corrected yacat.py file and executing with the same command as above:

python yacat.py '?a?a?a' '$P$5ZDzPE45CLLhEx/72qt3NehVzwN2Ry/' --log-file log.txt
python yacat.py ?a?a?a $P$5ZDzPE45CLLhEx/72qt3NehVzwN2Ry/ --log-file log.txt

the problematic log fragment we analyzed above, becomes:

[2021-01-22 16:22:17,217 DEBUG yapapi.events]
CommandStarted(agr_id='bf8c484e362af417cbf5fc440281debd5c5cf772b433f53e0ecb90b4faac55f4',
task_id='1', cmd_idx=3, command={'run': {'entry_point': '/bin/sh', 'args':
['/golem/work/keyspace.sh'], 'capture': {'stdout': {'stream': {}}, 'stderr': {'stream': {}}}}})
[2021-01-22 16:22:18,303 DEBUG yapapi.events]
CommandExecuted(agr_id='bf8c484e362af417cbf5fc440281debd5c5cf772b433f53e0ecb90b4faac55f4',
task_id='1', cmd_idx=3, command={'run': {'entry_point': '/bin/sh', 'args': ('/golem/work/keyspace.sh',),
'capture': {'stdout': {'stream': {}}, 'stderr': {'stream': {}}}}}, success=True, message=None)

As you can see, we have success=True there, which means that the ctx.run execution was successful.

Congratulations! The bug has been corrected!

Log files - what else is there?

In the log file, there is all the information you are already familiar with, because it is displayed as INFO messages in the stdout of the requestor application examples. Between the familiar INFO entries, you will find additional DEBUG messages. Let's have a look at the most interesting ones.

Requestor > Provider file transfer details

The transfers of files from the Requestor to the Provider can be identified in the log as pairs of CommandStarted and CommandExecuted lines similar to:

[2021-01-22 16:22:16,773 DEBUG yapapi.events]
CommandStarted(agr_id='bf8c484e362af417cbf5fc440281debd5c5cf772b433f53e0ecb90b4faac55f4',
task_id='1', cmd_idx=2, command={'transfer': {'from':
'gftp://0xf378a153b24fe98227ee153a0cb056faed155ba7/050a02b6976e59fbc1b9dfa6292d4f63942a820ee7dcc859af0a5efaea80c959',
'to': 'container:/golem/work/keyspace.sh', 'format': None, 'depth': None, 'fileset': None}})
[2021-01-22 16:22:17,217 DEBUG yapapi.events]
CommandExecuted(agr_id='bf8c484e362af417cbf5fc440281debd5c5cf772b433f53e0ecb90b4faac55f4',
task_id='1', cmd_idx=2, command={'transfer': {'from':
'gftp://0xf378a153b24fe98227ee153a0cb056faed155ba7/050a02b6976e59fbc1b9dfa6292d4f63942a820ee7dcc859af0a5efaea80c959',
'to': 'container:/golem/work/keyspace.sh'}}, success=True, message=None)

The source file name is identified by

'from':'gftp://0xf378a153b24fe98227ee153a0cb056faed155ba7/050a...'

The gftp is internal Golem protocol for transferring files in the Golem network.

The target file name is identified by

'to': 'container:/golem/work/keyspace.sh'

success=True informs us that the operation was successful.

stdout of executing commands on the Provider

Any execution of the ctx.run function - which is what we use to run commands inside provider's execution units - will result in the following entries in the log file: a CommandStarted, followed by any number of CommandStdOut depending on the output produced by the command and finally, one CommandExecuted. For example, this is the result of yacat's main ctx.run :

[2021-01-22 16:22:36,986 DEBUG yapapi.events]
CommandStarted(agr_id='ac6c0ea05bd668ee6afc5d2cf255e3f6ac40eabe78b4a313e18d521eaa07dfa4',
task_id='2', cmd_idx=3, command={'run': {'entry_point': '/bin/sh', 'args':
['-c', 'rm -f /golem/work/*.potfile ~/.hashcat/hashcat.potfile; touch /golem/work/hashcat_0.potfile;
hashcat -a 3 -m 400 /golem/work/in.hash ?a?a?a --skip=0 --limit=3009 --self-test-disable -o
/golem/work/hashcat_0.potfile || true'], 'capture': {'stdout': {'stream': {}}, 'stderr': {'stream': {}}}}})
[2021-01-22 16:22:37,018 DEBUG yapapi.events]
CommandStdOut(agr_id='ac6c0ea05bd668ee6afc5d2cf255e3f6ac40eabe78b4a313e18d521eaa07dfa4',
task_id='2', cmd_idx=3, output='hashcat (v5.1.0) starting...\n\n')
[2021-01-22 16:22:37,018 DEBUG yapapi.events]
CommandStdOut(agr_id='ac6c0ea05bd668ee6afc5d2cf255e3f6ac40eabe78b4a313e18d521eaa07dfa4',
task_id='2', cmd_idx=3, output='\x1b[2K\rParsed Hashes: 1/1 (100.00%)\x1b[2K\rParsed Hashes: 1/1 (100.00%)')
[2021-01-22 16:22:37,018 DEBUG yapapi.events]
CommandStdOut(agr_id='ac6c0ea05bd668ee6afc5d2cf255e3f6ac40eabe78b4a313e18d521eaa07dfa4',
task_id='2', cmd_idx=3, output='\x1b[2K\rCounted lines in /golem/work/in.hash...')
[2021-01-22 16:22:37,019 DEBUG yapapi.events]
CommandStdOut(agr_id='ac6c0ea05bd668ee6afc5d2cf255e3f6ac40eabe78b4a313e18d521eaa07dfa4',
task_id='2', cmd_idx=3, output='Counting lines in /golem/work/in.hash...')
[2021-01-22 16:22:37,019 DEBUG yapapi.events]
CommandStdOut(agr_id='ac6c0ea05bd668ee6afc5d2cf255e3f6ac40eabe78b4a313e18d521eaa07dfa4',
task_id='2', cmd_idx=3, output='OpenCL Platform #1: Intel(R) Corporation\n========================================\n*
Device #1: Intel(R) Core(TM) i7-4790K CPU @ 4.00GHz, 5347/21388 MB allocatable, 4MCU\n\n')
[2021-01-22 16:22:37,048 DEBUG yapapi.events]
CommandStdOut(agr_id='ac6c0ea05bd668ee6afc5d2cf255e3f6ac40eabe78b4a313e18d521eaa07dfa4',
task_id='2', cmd_idx=3, output='\x1b[2K\rSorting hashes...\x1b[2K\rSorted hashes...\x1b[2K\r
Removing duplicate hashes...\x1b[2K\rRemoved duplicate hashes...\x1b[2K\rSorting salts...\x1b[2K\rSorted salts...\x1b[2K\r
Comparing hashes with potfile entries...\x1b[2K\rCompared hashes with potfile entries...')
[2021-01-22 16:22:37,048 DEBUG yapapi.events]
CommandStdOut(agr_id='ac6c0ea05bd668ee6afc5d2cf255e3f6ac40eabe78b4a313e18d521eaa07dfa4',
task_id='2', cmd_idx=3, output='\x1b[2K\rGenerated bitmap tables...')
[2021-01-22 16:22:37,048 DEBUG yapapi.events]
CommandStdOut(agr_id='ac6c0ea05bd668ee6afc5d2cf255e3f6ac40eabe78b4a313e18d521eaa07dfa4',
task_id='2', cmd_idx=3, output='\x1b[2K\rGenerating bitmap tables...')
[2021-01-22 16:22:37,963 DEBUG yapapi.events]
CommandStdOut(agr_id='ac6c0ea05bd668ee6afc5d2cf255e3f6ac40eabe78b4a313e18d521eaa07dfa4',
task_id='2', cmd_idx=3, output='\x1b[2K\rHashes: 1 digests; 1 unique digests, 1 unique salts\n
Bitmaps: 16 bits, 65536 entries, 0x0000ffff mask, 262144 bytes, 5/13 rotates\n\nApplicable optimizers:\n*
Zero-Byte\n* Single-Hash\n* Single-Salt\n* Brute-Force\n\nMinimum password length supported by kernel: 0\n
Maximum password length supported by kernel: 256\n\n\x1b[33mATTENTION! Pure (unoptimized)
OpenCL kernels selected.\x1b[0m\n\x1b[33mThis enables cracking passwords and salts > length 32 but for the
price of drastically reduced performance.\x1b[0m\n\x1b[33mIf you want to switch to optimized OpenCL kernels,
append -O to your commandline.\x1b[0m\n\x1b[33m\x1b[0m\nWatchdog: Hardware monitoring interface not found
on your system.\nWatchdog: Temperature abort trigger disabled.\n\n
Initializing device kernels and memory...\x1b[2K\rInitializing OpenCL runtime for device #1...')

...

[2021-01-22 16:22:40,649 DEBUG yapapi.events]
CommandExecuted(agr_id='ac6c0ea05bd668ee6afc5d2cf255e3f6ac40eabe78b4a313e18d521eaa07dfa4',
task_id='2', cmd_idx=3, command={'run': {'entry_point': '/bin/sh', 'args':
('-c', 'rm -f /golem/work/*.potfile ~/.hashcat/hashcat.potfile; touch /golem/work/hashcat_0.potfile;
hashcat -a 3 -m 400 /golem/work/in.hash ?a?a?a --skip=0 --limit=3009 --self-test-disable -o
/golem/work/hashcat_0.potfile || true'), 'capture': {'stdout': {'stream': {}}, 'stderr': {'stream': {}}}}}, success=True, message=None)

The output parts of the CommandStdOut contain the stdout of the ctx.run. For example the beginning of the hashcat execution can be traced by output='hashcat (v5.1.0) starting...\n\n' .

Provider > Requestor file transfer details

Getting files back from the provider to the requestor will leave a trace such as:

[2021-01-22 16:22:18,304 DEBUG yapapi.events]
CommandStarted(agr_id='bf8c484e362af417cbf5fc440281debd5c5cf772b433f53e0ecb90b4faac55f4',
task_id='1', cmd_idx=4, command={'transfer': {'from': 'container:/golem/work/keyspace.txt', 'to':
'gftp://0xf378a153b24fe98227ee153a0cb056faed155ba7/0GdLl2vUcEYgO7ExgkVuttc5oqeAmU90Im911Eqs5LZXzD9ehlYCUGH7p5oII9jSg',
'format': None, 'depth': None, 'fileset': None}})
[2021-01-22 16:22:18,513 DEBUG yapapi.events]
CommandExecuted(agr_id='bf8c484e362af417cbf5fc440281debd5c5cf772b433f53e0ecb90b4faac55f4',
task_id='1', cmd_idx=4, command={'transfer': {'from': 'container:/golem/work/keyspace.txt', 'to':
'gftp://0xf378a153b24fe98227ee153a0cb056faed155ba7/0GdLl2vUcEYgO7ExgkVuttc5oqeAmU90Im911Eqs5LZXzD9ehlYCUGH7p5oII9jSg'}},
success=True, message=None)
[2021-01-22 16:22:18,517 DEBUG yapapi.events]
GettingResults(agr_id='bf8c484e362af417cbf5fc440281debd5c5cf772b433f53e0ecb90b4faac55f4', task_id='1')
[2021-01-22 16:22:18,519 DEBUG yapapi.events] DownloadStarted(path='/golem/work/keyspace.txt')
[2021-01-22 16:22:18,521 DEBUG yapapi.events] DownloadFinished(path='keyspace.txt')

The source filename is identified by 'from': 'container:/golem/work/keyspace.txt' and the target filename is identified by DownloadFinished(path='keyspace.txt').

Docker exec - an alternative approach

Analyzing the requestor logs gives us the full picture of what is going on under the hood of Golem, but in some cases, it might be easier to start with executing a simple docker execthat runs commands on the container. The syntax of this docker command is:

docker exec [OPTIONS] CONTAINER COMMAND [ARG...]

This command can be used to execute the ctx.run scripts directly on the provider container. This is regular Docker usage, so we'll recommend you consulting the Docker's reference: https://docs.docker.com/engine/reference/commandline/exec/