"Fossies" - the Fresh Open Source Software Archive

Member "hashcat-6.2.6/docs/hashcat-plugin-development-guide.md" (2 Sep 2022, 121778 Bytes) of package /linux/privat/hashcat-6.2.6.tar.gz:

As a special service "Fossies" has tried to format the requested source page into HTML format (assuming markdown format). Alternatively you can here view or download the uninterpreted source code file. A member file download can also be achieved by clicking within a package contents listing on the according byte size field. See also the latest Fossies "Diffs" side-by-side code changes report for "hashcat-plugin-development-guide.md": 6.2.5_vs_6.2.6.

A hint: This file contains one or more very long lines, so maybe it is better readable using the pure text view mode that shows the contents as wrapped lines within the browser window.

Hashcat Plugin Development Guide

The purpose of this document is to introduce you to the development of plugins for hashcat 6.0.0 and newer. We will update this document regularly and add more detailed content. The content in its current state includes enough details to write easy, medium and hard plugins.

With hashcat 6.0.0, a new interface has been designed which enables you to add new hash-modes more easily than in older hashcat versions. The plugin interface is an essential new feature of hashcat 6.0.0.

One of our goals was to have the new interface to be independent from future versions of hashcat. This is achieved by hashcat loading your plugin code dynamically from a .so/.dll/.dylib library on startup. Another goal was to give the author of the plugin the option to share the plugin as source or in binary form. This is achieved by a clear separation between hashcat core code and plugin code. There is no longer a need to change hashcat core sources in order to add a new hash-mode. All existing hash-modes (300+) from older hashcat versions have been refactored to this new interface.

We are well aware that as a developer you want to see as little change as possible on this interface. That is why our third goal was to get the interface to a fairly final state and minimize the risk of changing it once it is released. That is not an easy task. When you are designing such an interface, there is always a chance that you are missing some details for rare use cases. The refactorization of the 300+ existing hash-modes served both as a reference check and a feasibility study. We do not plan to change the interface except if there is a strong need for it. For that unlikely event of a major change, there is an automatic version check which is added automatically to your module at compile time.

To make kernel development as easy as possible, we have already started in previous hashcat versions to include GPU-optimized OpenSSL-like crypto interfaces and finalized it with hashcat 6.0.0. If you are familiar with that interface, you know it typically uses a chain of context init(), update() and final() function calls. In all refactored pure kernel sources, you can see this interface type design being used. It is also our hope that the structure of the well known interface will make it easy for developers to use the existing kernel source as a useful reference.

Developing a hashcat plugin can be very overwhelming at first. Do not get discouraged by it. After the first plugin, you will already feel practiced - and you will soon realize that the development steps are always the same.

Plugin Structure

Let us jump right in. To develop a plugin for hashcat, you basically just need to create two files:

You will read the terms "module" and "kernel" quite often from now on. Just for terminology, the combination of both "module" and "kernel" is what we call a hashcat "plugin".

There is an -optional- third file: The unit-test stub. In this file you can implement the crypto scheme of your hash-mode from a "high-level" perspective. The stub is then called from hashcat's own testing suite. The goal is to test your hashcat plugin implementation by comparing the results of the unit-test stub with the results created from hashcat itself while using your plugin. It will automatically generate random passwords, salts, hashes, etc. for you and compare everything in very deep detail - so you can be sure your plugin implementation works in all different attack modes and most importantly also in some corner cases that might exist.

Before the code

You need to code in the C language. If you are a C beginner, this may be a bit too hard, but if you have programming skills in C or if you have crypto programming skills in a different programming language, you should be able to write a hashcat plugin. While you have this documentation as a reference, it won't give you all the information (rarely used module functions or kernel function etc) you need. Be prepared to study existing code from other plugins for information.

Rule number one: It is pointless to start developing a plugin if you do not have a deep understanding of the algorithm which you want to implement. Writing a plugin is a multi-layered process which you have to approach step by step. You never write the plugin from start to finish in a giant leap. Since we are working on a very low level, there is a big need for small milestones where you can stop and control intermediate values. For instance, if you want to implement a hash-mode which does md5(sha1(p)), thenyouimplementsha1(p) first. At this point it makes sense to add a milestone to control the intermediate hash before you continue implementing the md5(). But where do you get these intermediate control values from? The answer is simple: from proof-of-concept code which you already have or write yourself first. It does not matter which language that this POC is written in, as long as you know how to breakpoint or how to manipulate the POC in order to print the wanted intermediate values.

One more thing. You need to make an essential decision before you start with your implementation. You need to categorize your algorithm beforehand. Based on the details of the algorithm, it is either a so-called "fast" or a "slow" kernel type. This decision cannot be changed easily afterwards, so take your time. Do not worry, the right answer to this decision is simple and you can derive it from some algorithm details.

Rule of thumb:

Otherwise you probably want to develop a "fast" kernel. Note that the most crypto primitives that would need to be implemented as a fast kernel already are implemented. However, if you actually want to write a "fast" kernel the main goal is to workaround the PCI Express bottleneck. For a more detailed explanation on how to calculate the bandwidth please have a look here: https://hashcat.net/wiki/doku.php?id=frequently_asked_questions#does_the_pci-express_speed_have_any_influence_on_cracking_speed. To workaround this bottleneck, you need to write attack-mode specific kernels. These kernels have a very special structure, but from a development perspective you just write your code in a "block" inside another level of a for() loop, the remaining parts are similar to what is used in a typical slow kernel.

However, I expect most people reading this document want to write a "slow" kernel. Luckily, writing a slow kernel is easier to start with. Most of the time, you will find the compute-intense code in a kernel you can copy/paste from other kernels or the GPU crypto library and it is just for the final comparison/verification that you need to put in some brain power.

Another preparation you need to make before you start coding is to pick the right hash-mode number. To make it short, there is some logic to this, but it is not easy to explain.

Development Environment

In theory there is no special hardware required for hashcat plugin development. However, there are some recommendations that we can give you:

One of the most important factors for choosing the right compute API is that it supports using printf() from inside the kernel. In the past this way of debugging was not possible, which made kernel development a real pain. With the current OpenCL drivers this works pretty well. Get used to the idea that printf() becomes your primary debugging utility. Since you only write a very small piece of code in the kernel it is not as bad as you may think.

Personally, I like to write my plugins on Linux, but of course you can also use macOS or Windows. All regular runtimes support debugging functionalities on all operating systems.

Some more remarks for the hardware of your development platform:

My Development at the time of writing this document (beginning of 2020) is an Intel I5 generation 6 with a regular SSD and 16GB memory. The system runs on Ubuntu 18.04 Server. The GPU is an NVIDIA GTX 980. Additionally I am using Intel OpenCL runtime, but only to test the code on the CPU afterwards.

Before you actually start with your implementation make sure you have already cloned hashcat from GitHub master, that you are able to compile it on your system and that it runs smoothly. Make sure you have a clean installation with no previous version artifacts laying around.

Test Suite

The optional unit-test stub originally was made only to automate the task of plugin verification. It is written intentionally in a different language (Perl) and not in C. From our perspective this has a lot of advantages and benefits. Despite of the language the input and the output data of any hashing/encryption algorithm should be the same. If they match, then your plugin implementation is very likely to be working correctly.

From our experience in the last years adding new hashcat hash-modes we cannot stress enough how important it is to have a POC (as described earlier) to print intermediate values. If we do not already have some sort of POC, we use this optional unit-test stub as a POC replacement. Writing a unit-test is typically done from a high-level programming language, thus Perl is a good candidate to do so, but there is also some unit-test stubs written in python. At this point we already created some synergy because you can use it as a POC to start with the development and later it acts as a normal unit-test stub and you do not have to write it twice. If you do not care about POC's and unit-test you can directly jump to the module subsection from here.

The Test Suite is a Perl Framework. The main program (tools/test.pl) loads at runtime the hash-mode specific code written like a plugin. The structure of this perl module is standardized. We have already mentioned that all existing code to the 300+ hash-modes from previous hashcat versions have been refactored. Also all 300+ hash-mode specific unit-test stubs have been refactored into this new Test Suite Framework. The same way the before mentioned modules and kernels act as a reference, the unit-test stubs can also be used as reference. In most of the cases you can simply copy/paste from an existing unit-test stubs, change a small piece of code and both are ready, the POC and the unit-test stub.

The test suite itself consists of two files:

The filename of your unit-test stub has to be: tools/test_modules/m[hash_mode].pm


The tools/test.pl Script has three different use cases:

When calling tools/test.pl from the command line, the first parameter you have to give is the use case type. It should be either "single", "passthrough" or "verify".

You need to implement three methods in your unit test stub. Note that the use cases are not directly related to the methods. You need to implement all three, then you can make use of all three use cases:

The second parameter is the hash-mode itself. In case of "verify" you have to give some additional parameters. For the exact syntax please see tools/test.pl --help.

In order to get tools/test.pl running you need to install a lot of perl modules. To help you install them quickly, we have developed a simple script tools/install_modules.sh. You may want to take a look inside before you execute it. At this time, none of the perl modules require a special version which means you can also use the perl modules which your distribution offers to you (if you prefer it that way, for instance the GCrypt perl module with apt install libcrypt-gcrypt-perl on Debian/Ubuntu).

Single Mode

In single mode, a number of random passwords are generated for the selected hash mode. Each of the generated passwords is passed to the module_generate_hash() method (which is one of the methods you have to populate with code) and thus a hash is generated. In the end, both information, password and final hash line (which typically also contains the salt) are output to stdout, so that you can execute the output as if it would be a real shell script. If your hash-mode requires one (or more) salts, this will also be created automatically. The most important thing is that test.pl generates passwords of different lengths, with the guarantee that the minimum and maximum length password are always included.

Attention: The testing suite expects that the module_generate_hash() method will return the output of the final hash line. You have to return this as a string in the exact format that hashcat will later accept.

If your implementation contains optimizations based on the password length (for example 0x80's, zero based options, etc.) then you would also want to verify that such optimizations also work with all possible password lengths. Therefore, another function must exist in your stub: module_constraints().

The module_constraints() method is easy to understand. It returns exactly 5 integer pairs. These pairs always define a range, therefore they consist of a minimum and a maximum number. The order of the pairs is the following:

If you do not need one of the named pairs or the pair does not make sense because it is not applicable, you must use -1 for minimum and maximum. Please note that there is a strong difference between pure and optimized kernels. We have not discussed this concept so far, therefore let us stick to pure kernels. With a few exceptions, slow hash types have no implementation of an optimized mode, because the performance does not drop too much because of register pressure, but because of the iteration count, which you cannot optimize. We will come to the different kernel modes in the kernel section.

Another important note about salts. Often one or more salts are needed. Possible iteration counts, IV or random content data can also be seen here as "salt". This data can be so different that it does not fit into a single policy / interface. Therefore, test.pl cannot standardize this complex situation. For simple forms of salts only, test.pl provides you with a simple form of random salt data. You can specify the length constraints (min/max) of the salt data in the constraints section. In more complex situations, you will not be able to avoid creating your own salts by calling some helper functions that test.pl provides you with, directly in the module_generate_hash() method.


my $iter      = shift // 10000;
my $user_salt = shift // random_hex_string (128)
my $ck_salt   = shift // random_hex_string (128)
my $user_iv   = shift // random_hex_string (32)

Passthrough Mode

In passthrough mode, test.pl expects the passwords from you, quite the opposite of single mode where they were generated automatically. Every password that you send via stdin (e.g. pipe) is passed to the module_generate_hash() method and the resulting hash is sent to stdout. The rest is identical to single mode.


$ echo hashcat | tools/test.pl passthrough 1600

Note: In this specific case the newline character after the password (compare it to echo -n hashcat) is used as a delimiter between the lines/passwords and therefore not considered part of the password (remember that echo hashcat | md5sum for instance produces "wrong" results, in general, because of the extra newline).

Verify Mode

In verify mode you go one step further than in single or in passthrough mode. In two different files you give both, a specific hash and in another file the (same) hash including the matching password. The goal is that the expected hash is generated from the module_generate_hash() method, which is then compared with the input from the first file. If the comparison passed, the original hash is written in a third file.

This is where the module_verify_hash() method is used for the first time. In this method, you have to break down the hash line into its individual parts, especially those components that are absolutely necessary to reconstruct the exact same hash. For instance, if the algorithm needs one or more salts, then this salt must be extracted. Finally you call the module_generate_hash() using the extracted components (salts, iteration counts etc).

At this point we need to go back to the module_generate_hash() function. To make the verification work. It is necessary that your module_generate_hash() function recognizes whether a salt has been newly generated and, if this is the case, only then actually generate a random salt yourself (as described in the single mode).

Here is a real life example from Android Backup plugin (tools/test_modules/m18900.pm)

sub module_generate_hash
  my $masterkey_blob = shift;
  if (defined $masterkey_blob)
    # verify call, write code to use the given salt
    # regular call, write code to generate random salt

The script is called with the following command line parameters:

perl tools/test.pl verify 18900 hash_list.txt cracked_list.txt verified_list.txt

After the command line parameter "verify" the hash mode is specified ("18900" in this example), followed by the original hash list (hash_list.txt) without passwords. After the hashfile the path of the file with the list of cracked hashes, including passwords is given. The format of this file is simply hash[:salt]:password the same way as hashcat would output them. Note that you can have multiple lines. The third parameter specifies the output file. It contains the lines that have been verified as correct and that also appear in the original hash list.

You should also always test that the exit code of test.pl is 0, otherwise it could be that the output file was not overwritten.

The verify mode is an excellent replacement for a missing POC.


The test.sh is an overlay for test.pl, which actually calls the hashcat binary based on the return values from test.pl in single mode (it interacts with both). The test.sh shell script also compares the return values of the hashcat binary with the expected result. This includes tests such as whether all hashes have been cracked, whether the associated password is the correct one and not any other from the test.pl return, whether the output hash is displayed in the correct format, etc.

Furthermore, the script has many different options (when called in the command line) with which you can narrow down to specific tests. You typically want to make use of this feature, because a complete test run across all hash modes can take several days.

The main options:

If the options are not set, attack-mode 0 for hash-mode 0 is executed. To see additional options, see tools/test.sh --help


The first really needed ingredient to create a plugin is the module. The module is a single .c source code file in which you can freely implement the 68 different interface functions or add your own auxiliary functions. No worries, I have never had a module which required me to implement all 68 functions. Many functions are really only required in special cases. In the best cast you only need to implement 2 functions.

The integration in hashcat is very easy. Your module is compiled to a .so shared object on Linux (or .dll on Windows and .dylib on macOS). The moment when hashcat starts, it loads the shared object corresponding to the hashmode the user specified by the -m option (default is -m 0).

The path in which you have to store your module is src/modules/module_XXXXX.c. From there the module is compiled as a shared object into the folder $(SHARED_FOLDER)/modules/module_XXXXX.[so|dll|dylib]. The XXXXX is the hash-mode number (with leading zeros). There is no need to adjust any hashcat core sources. The makefile src/Makefile automatically finds the module you added and compiles it with the necessary flags.

If you run hashcat under linux or macOS without the make install target from the current working directory, then $SHARED_FOLDER typically equals the current working directory. On Windows it is always the current working directory because there is no install target in the makefile. A modification of the Makefile is probably only necessary in exceptional cases, i.e. if your module requires an external library. In this case and if you want to contribute the plugin to upstream, then we have to coordinate the development. Please contact us directly in such cases.

There is no need to implement any black magic into the module. A module covers exactly what you would expect from plugin, which is this:

Namely, these configurations take place in a variety of optional functions that you provide in every module. There are a few mandatory functions which need to be implemented:

At first glance, this looks a bit overwhelming. But in fact, all of these functions (apart from the first three) are only configuration parameters. You may wonder why we have not designed the module by simply setting a macro/number item for each of the configurations. But we wanted to give you the opportunity to change this configuration at runtime and not just at compile time. For example, if you only want to use a specific optimizer but only if the user runs the kernel on a GPU and not if the user runs it on a CPU. But this actually goes into deep detail. In reality, it usually looks like this:

static const char *HASH_NAME = "MD5";
const char *module_hash_name (...) { return HASH_NAME; }
module_ctx->module_hash_name = module_hash_name;

As you can see here, there is a so-called module_ctx object. Here you register all functions that you have implemented in your module and which should be used by hashcat. A list of all functions and their prototypes can be found under include/modules.h.

Hashcat will automatically call the module_init() function when it loads your module. In this function, you simply register all the functions that you have programmed by assigning it to module_ctx.


module_ctx->module_hash_name = module_hash_name;

For all functions that you do not use, please use the macro MODULE_DEFAULT. Using this macro, hashcat can see that its module_ctx_t structure is in the correct version (if you only want to distribute a binary). For instance, if a new function is added in a future version, the structure in the binary-distributed older version is one address too short and contains the value NULL. With this approach, hashcat can ensure that you work with your compatible module_ctx_t structure.

The only two mandatory functions that you normally have to program for a minimal plugin integration, are the decoder function module_hash_decode() and the encoder function module_hash_encode(). The other remaining mandatory function which typically only consists of static configuration items, but not code. Here is each of them explained:


There are only two different types that have already been discussed in the "Before the code" chapter. Here you determine whether your kernel is slow or fast hash.

The naming goes back to how the password candidate generator was implemented in the past. If the hash is a slow hash, reading a password candidate from GPU memory does not have any relevant impact on the performance. Therefore we can execute the password candidate generator in a standalone kernel and call it beforehand (there are actually three: STRAIGHT, COMBINATOR, MASK). The kernel then writes the resulting candidates to GPU memory. After that, hashcat calls the hash-type specific _init kernel function which loads the candidates from GPU memory.

If the crypto of the kernel is very fast, reading from GPU memory would create a bottleneck. For fast hashes we need a different approach. We load a "base" passwords from GPU memory to GPU registers, but only at the start of the kernel, in the "outer" loop. Then we enter an inner loop inside the kernel in which we iterate through a (limited) number of modifications and apply them to the base word. Note: this is what we call the kernel-loops. We can modify the candidate on a register level which keeps the memory access very low. In hashcat we have three base attack modes (STRAIGHT, COMBINATOR, MASK), for each of which we need to implement a specific kernel. The only difference is how we apply the modification on the base password candidate.

I will explain more about the details in the kernel section, but this already explains why writing a slow hash kernel is much easier. It is just one kernel, not three.



module_dgst_pos0() - module_dgst_pos3()

Before we can understand exactly why this is here, we must briefly note the following:

Cracking passwords is about time. That means depending on the speed of your kernel, iteration count, etc. you will never be able to try out a certain amount of password candidates. For instance, take NTLM. This hash mode is calculated on a high-end GPU like a 2080Ti with approx. 100GH/s. If we assume that you have 8 of these GPUs in a node and if we assume that you have 1,000,000 nodes, then you will still need roughly 6,500 billion years to try out 2^127 password candidates. What I want to say with this: there is an upper limit of password candidates we can search through because we are always limited in time, no matter how much money we spent on hardware. Or from a different angle: It does not matter how many bits a hash actually outputs because what actually prevents guaranteed cracking is not its output bit size but only time to go through a specific keyspace. At this time of writing I assume with a multi million dollar budget you can search through a maximum of 56 bits per second. If we assume a runtime of 10 years we can cover another 30 bits. So it is safe to say we can never search through a keyspace which is larger than roughly 90 bits. Therefore it is safe to store only 128 bits of it in a lookup database in hashcat. That means even for SHA512 with 512 bits, we are actually only interested in 128 bits of it. That is great, so we only need to test for 128 bits instead of 512 bits and save some clocks.

That means that you have to "select" these 128 bits. This makes particular sense because a cryptographic algorithm can never calculate all of its bits at the same time in its implementation. For instance, in MD5 (128 bit) the first 32 bits are calculated first, but then followed by the last 32 bits, then the penultimate 32 bits, etc. Exactly this effect which must exist in all hash algorithms is exploited by the hashcat optimizers. After the first 32 bits are known they are checked against the database if they do not exist, no further steps of the crypto algorithm need to be calculated. This order is given in an index of 32 bit integers. That means for MD5 the first check index is 0 and the second index is not 1 but 3, etc.

To find the right values for you, just take a deep look into your crypto algorithm and figure out which parts (in 32 bit blocks) of the output hash is finished first, which one finishes next, and so on.


static const u32 DGST_POS0 = 0;
static const u32 DGST_POS1 = 3;
static const u32 DGST_POS2 = 2;
static const u32 DGST_POS3 = 1;

For slow hashes, you will normally set the "order" to 0, 1, 2 and 3, because such optimizations are of little importance here.


As you can see in the previous section, only 128 bits of the target hash are tested, but the hash must be saved entirely so that it can later be converted from its binary form back to its original form. Only you can know the size of the hash you are storing, but hashcat needs this information so that it can allocate necessary memory buffers.

Important: Hashcat will provide this buffer of the size you specified in a void buffer which you will use as *digest_buf in the decoder/encoder.

Some macros already exist because they have already been used in many modules before. A list of the well known digest sizes already defined can be found in include/types.h.


static const u32 DGST_SIZE = DGST_SIZE_4_4;

This scheme is a bit cryptic. It goes back to the following logic: We store 32 bit values (each 4 byte) and we have 4 of them, in that order. So the macro results in 16 bytes (128 bits).

Tip: If you have a real hash as a target, this is relatively self-explanatory. But if you are not using a real target hash, but for example some encrypted data, then it makes more sense to save this data in an "esalt" struct. From there only save the encrypted data (for example the first 16 bytes) as a digest replacement. Encrypted text has a sufficiently high entropy to provide the uniqueness that hashcat expects in the digest buffer. An "esalt" is described at the end of this document because there is two different types of salt structures used in hashcat code.


This configuration has no influence on the process in hashcat, but only serves for documentation. The only time that hashcat is currently using this information is when it is called with --help. Here the modes are first sorted by category. A list of the categories already defined can be found in include/types.h.



If you want to add a category/type, that is not a problem. But since a change like this would change hashcat's core source, you should use a separate pull request (PR) in GIT.


This configuration has no influence on the process in hashcat, but only serves for documentation. It is a simple string that you can name as you like. It is then displayed in the status view or in the --help menu.

Tip: Try to limit the length to a maximum of 48 characters, otherwise it may exceed the maximum column size in the --help menu.


static const char *HASH_NAME = "MD5";


Here you specify the kernel hash mode number that this module should load. A kernel is always located under OpenCL/mXXXXX_a[0|1|3]-[optimized|pure].cl. This configuration is possible because of the feature to use a kernel from different modules. A good example of this is PBKDF2-HMAC-SHA512 which is used for both GRUB2 and macOS 10.8.

In theory, one could imagine implementing all such hash modes in a huge single kernel in this way, but such a kernel would become very inefficient due to the number of branches. Again, this is a trade off from readability/maintainability vs. performance. Since a password cracker is a product in which high performance is one of the most important properties, it usually ends up in a dedicated kernel for each hash mode. A look at the OpenCL/ folder will confirm that.


static const u64 KERN_TYPE = 7100;


This configuration item is a bitmask field. There are a few switches which you can enable and disable. But be careful, some of them have the potential to break your plugin. I recommend being very cautious using these flags. As always, the list of flags can be found here: include/types.h. I will comment the ones which exist right now:


The decoder function is the function that is called again and again for every line in your hashfile. We also call this sometimes the hash parser. Here you have to program the logic which decodes the line into its components and then stores them in the standardized data structure which hashcat understands.

Typically hash files are text files in which each hash is stored in a single line. Before the decoder function is called, Hashcat opens the hash file and scans the number of lines. Based on this information, it pre-allocates memory buffers for the hash digest, the salt and the esalt so you do not need to allocate any buffers from inside the decoder. If you allocate buffers from inside the decoder, you must free them as well. The size of the digest is based on what is returned from module_dgst_size(). The salt_t is a fixed size structure and the esalt size is known from module_esalt_size(). There is also some more rarely used buffers like module_hook_salt_size() but the logic is always the same. Hashcat simply multiplies the size of all these different structures by the number of lines. It then rewinds the file handle and starts iterating. For each iteration of these input lines, the module_hash_decode() function is called. The input pointer points to the new hash line and the output pointers point to the corresponding previously allocated buffers. You can directly access the pointers to store the digest, salt, esalt and other buffers without any offsets.

In hashcat there are two different types of salt structures. It is essential to understand them; please read the section "About salts" at the end of this document first. If you are unaware about the different concepts of salt_t and esalt, you really need to read that section before you continue this section.

For instance, if your crypto algorithm is something like MD5(MD5(pass.salt)), then you can expect to find both a hash and a salt in each of your hash lines. In the decoder function, it is up to you to split these two parts (typically by using the tokenizer - please read the tokenizer section below) and copy them into a standardized hashcat structure.

You are also responsible for checking the boundaries of all the input data. The tokenizer helps you with some very specific validation routines beforehand, but some very specific tests can be done only by you. If there is an error, you can simply return from this function by setting a specific error code, such as PARSER_HASH_VALUE or some other descriptive output. As always, a list of available error codes can be found in the include/types.h header. If everything works okay, you need to return this function with PARSER_OK.

If you are working with salts, you need to guarantee you have set the salt_buf[] array and salt_len value. If you are writing a slow kernel, you need to guarantee you have set the salt_iter value. Please read the "About salts" section if you do not know what these variables are.

In addition to the input line and the output buffers, there are some extra buffers available for you in the decoder functions. For instance, the *hashconfig structure. Sometimes you need to execute different branches of code in the decoder, based on the user options being set. For instance, if the user sets the -O option on the command line, you can detect this from inside the decoder by checking the OPTI_TYPE_OPTIMIZED_KERNEL in hashconfig->opti_type. A good example for this is src/modules/module_00000.c which exploits the fact that you can reverse the Merkle-Damgard construction in optimized kernels since it is guaranteed that the password is never longer than 55 characters.

A typical decoder function covers the following actions (not all of them always apply):


The opposite is the case with the encoder function. It is only called up as soon as hashcat has to provide the user with the hash in its original hash form. For instance, if a hash is cracked or in the status display (with single hashes). Typically you will find a number of snprintf() statements here, but for more details, there is an extra section below.

Important: Keep in mind the input buffer you get access to in the encoders are probably reused again in a later point of time. If you need to modify values before printing it to the user, make sure to not write into the original buffers. Instead, create a local buffer, copy the data to that buffer, and modify it in there.

The final hash should go to line_buf[] array and the length of the data in this array is the return value of the function itself.

Note: the general rule is that the kernel code should not do any unnecessary repetition of data manipulation (e.g byte swaps etc) because it should run as fast as possible. Instead, the encoder and decoder are host functions that are normally only executed very rarely - and therefore it is not a problem if they need to change the data a little bit to pre-compute, or adapt the data to make it look nice in the output.


This module can be used when you have certain configuration items of a hash that you want to override with a command line parameter. A good example is the --hccapx-message-pair, where the user can add additional filter criteria so that Hashcat doesn't load a specific set of hashes from the hash list.


This configuration item is a bitmask field and is very similar to the module_opti_type() function. The main difference is that here you configure general options of the workflow and not optimization specific settings. As always, the list of flags can be found here: include/types.h. The following list contains the flags currently supported:


This option tells hashcat that your hash is salted. Not all hashes are salted (mainly raw hashes have no salt). If they are salted, hashcat needs to know which strategy to use for storing the hashes. Here are the possible options:


Here you provide a hash for the self-test functionality. You also need to provide the correct password for this hash later. Please only use artificial hashes which you generated yourself. Typically this is a hash with the password "hashcat" which you have generated by using test.pl in passthrough mode.

Note that this hash will also be used as reference for the benchmark mode. In some rare circumstances to have a not too long running iteration count to reduce startup time delays. A good example for this is iTunes 10+ src/modules/module_14800.c. This can also be done using test.pl.

The --example-hashes command line argument together with a specific hash mode (-m) will also instruct hashcat to show the example hash and example password.


This is the password to crack the hash given in module_st_hash() for the self-test functionality.

module_hook_extra_param_init() and module_hook_extra_param_term()

These two functions were added to hashcat beginning with version 6.2.0 when it was required for a module to use a 3rd party library from inside a hook function. However, the module developer is free to decide how to use this buffer (it doesn't have to be a library handle). Generally, it is a buffer that the module developer would use for something that they need to initialize and terminate only once on startup and shutdown for performance reasons.

It is unclear to hashcat if this buffer will be handled in a thread-safe fashion or not, but since hooks are multi-threaded, hashcat will allocate multiple buffers (as many as there are hook threads) of this type for the module developer. This enables the module developer to load 3rd party libraries where they have no control over and which are known to not be thread-safe.

The module developer does not have to care about managing the multiple instances but has to provide the size of the buffer to be allocated. For this, they have to use the module_hook_extra_param_size() function. The buffer in *hook_extra_param is zeroed and ready to be written when module_hook_extra_param_init() is called. On startup, hashcat will call module_hook_extra_param_init() that many times as there are hook threads each time providing the module function with a new buffer. The same logic applies to module_hook_extra_param_term() on shutdown. Hashcat will also free the memory on shutdown.

A good example for this is: src/modules/module_11600.c and src/modules/module_23800.c


This is a mask that is hash mode specific and is normally used whenever the mask that should be used in benchmarks is of a very specific length and/or pattern. This could be a dynamic string, but in general try to use a variable called BENCHMARK_MASK.


This is a custom charset setting (like --custom-charset1) that always needs to be synchronized with the default benchmark mask (see module_benchmark_mask (), or the default one if not set). This setting does only affect --benchmark and you can use this charset within your mask with ?1. Try to use a variable called BENCHMARK_CHARSET for this string whenever possible.


This is a salt that is hash mode specific and allows you to set a salt_t structure dynamically. This could sometimes be needed if custom iteration counts or salt lengths are needed for a specific hash mode.


This is the second necessary ingredient for creating a plugin. Particular attention should be paid to the development of the kernel. Compiling the kernel takes a relatively long time, so both hashcat and the various compute APIs try to save a binary kernel in a cached structure. This serves to reduce the startup time and it is important for the user experience (UX). This however can be a pain as a developer.

NOTE: You -must- manually delete all cached kernels with every change to your kernel code. This is -very- important!

$ rm -rf kernels/

Note: make clean will automatically remove all cached kernels, but the recompilation with make of the whole hashcat binary will of course take much longer and this is therefore not recommended.

Keep in mind that a GPU is a multi-core device; hashcat will always try to utilize the parallelization power of the hardware as much as possible. This can be undesirable behavior while you are developing a kernel. Especially when you start using printf() - your console can get flooded easily with debugging information because hundreds of work items will be executed. Keep in mind that the printf() will be called for every workitem.

Additionally, there are a couple of kernel invocations which are unwanted when developing a kernel. They go back to the self-test functionality and the autotune engine. Both features are important for user experience. Keep in mind that the input password for these kernels invocation are not based on the password candidate you expect. It is therefore recommended to disable these features while developing the kernel.

To enable some of the special developing functionalities - for example, to disable the autotune - you need to unlock these undocumented features first. The first step is to enable debugging in src/Makefile by setting DEBUG to 1. Run a make clean afterwards.

Typically you want to develop the kernel with the least amount of unwanted side effects and we should invest in some proper preparation before actually starting writing code. A good example for this is the hash which you are using. Hashcat supports the hash given at the command line, but the command line can create unwanted side effects. For instance, the $ character could be part of the hash and you forgot to quote it correctly. The safer way is to write the hash and the password into separate files, as this will generally avoid any problems with interpretation from the shell. Of course this is not required, but it is the mindset which I am trying to emphasize.

Additionally there is a couple of command line parameters that you want to use:

Typically a developer command line for hashcat looks the following:

$ rm -rf kernels $HOME/.nv; ./hashcat -m XXXXX hash.txt word.txt --potfile-disable --self-test-disable -n 1 -u 1 -T 1 --quiet --backend-vector-width 1 -d 1 --force

When adding print statements keep in mind that you need to manually add a conditional to branch on a specific loop position, otherwise every parallel execution of the kernel will execute the printf(), flooding your terminal. So you can use either:

if ((loop_pos + i) == 0) printf ("%08x\n", a);

from a _loop kernel, or

if ((gid == 0) && (lid == 0)) printf ("%08x\n", a);

from a kernel without _loop.

Some last recommendations about printf() itself. Printing a string %s is not recommended. Missing zero bytes or big endian byte order can be very confusing. Instead try to use only the %08x template for everything. Especially for strings this makes a lot of sense, if for example you want to find unexpected non zero bytes. This can be done by calling printf() multiple times. Get used to this and it will simplify a lot of things for you.

To decide which type of kernel you want to write (pure or optimized), here are some recommendations when to write an optimized kernel implementation:

These recommendations apply for both fast and slow hashes.

In most cases, however, the code for the hash is exactly the same. In these cases you probably want to only implement a pure kernel.

Kernel parameters

Hashcat will call all kernels with exactly the same parameters. In most cases only a few of the parameters are used, but on the other hand they do not have a negative impact on the performance. Having a fixed prototype for all kernels makes it easier to work with all the different buffers and generally makes it easier to read kernels from other people.

There is no need for you to change anything. This section is only for information. While it is not relevant for you to know all the different parameters, at least some of them are important to know. Never write to any of them directly unless you know about the implications. Use the macros provided if you want to write something. Most of the buffer you do not even need to read from. The ones that are interesting for reading I will mark in the description.

The large number of kernel parameters can be confusing when writing a kernel. But since they never change, we can easily replace them with a macro. There is a couple of kernel parameter replacement macros where you need to choose one from for your kernel:

Kernel: fast hash type

The fast hash type is needed if we are cracking a hash that is so fast to compute that the PCI express bottleneck is taking more time than to compute the hash. These raw hashes are designed to compute very fast intentionally. They typically consist of only binary or arithmetic operations either with none or limited memory access. That means they often can be implemented on register level. On the other hand, if we need to access any memory structures just to provide the password candidates, it will hurt the performance significantly. Therefore the general concept of a fast hash kernel is to load a base password candidate directly onto a register and run a for() loop within the kernel which modifies the base password candidate.

The modification is depending on the attack-mode. Hashcat supports 5 different attack-modes with the -a command line flag (0, 1, 3, 6 and 7) but attack-mode 6 and attack-mode 7 share the same kernel code with attack-mode 1. This means we have to implement three kernels. These kernels are implemented in three kernel source files (0, 1, 3). Based on the attack-mode selected by the user on startup, hashcat will load the corresponding kernel.

The file name convention for fast hashes is: OpenCL/mXXXXX_a[0|1|3]-[pure|optimized].cl

Kernel: fast hash type (optimized)

As you can see from this convention, you actually have to implement six kernels if you want to add a full featured fast hash mode to hashcat. It is up to you if you want to save some time only implementing a pure kernel, only an optimized kernel or both. But in each case you must implement all three attack modes to support all the different attack types supported by hashcat.

Remember we only need to have those three different implementations due to the different ways the password candidate is generated. You may think it would be easier to have like three branches but these branches would already decrease the performance drastically.

Each fast hash kernel source in optimized mode has to provide the following kernel functions with this convention: mXXXXX_[m|s][04|08|16].

As always, the XXXXX is the hash mode with leading zeros. The m or s defines the multi-hash and single-hash implementation. In single hashes, often we can store the target hash on the register which makes the final test much faster compared to checking it on GPU memory. The m and s therefore often look almost the same. The only difference is that in the s kernel at some point you will store the target hash in a register. The final comparison function macro for m is COMPARE_M_SIMD() and for s is COMPARE_S_SIMD(). For single-hash this will add code to do on-register comparison. For multi-hash this will add the code to run the bloom filter and a binary tree search. For both cases, the macros expect you to provide 4 times 32 bit values in the same order as you have configured in the module functions module_dgst_pos0() - module_dgst_pos3(). Note that it always has to be 4 times 32 bit values, also for hashes which provide much more or much less bits output size. See the sections about module_dgst_pos0() - module_dgst_pos3() for details.

The [04|08|16] from the function name denotes the maximum password candidate length in 4-byte words. The 04 function limits the candidates to 16 bytes, 08 limits to 32 bytes, and 16 to 64 bytes. Only the brute-force a3 set of kernels need to implement all of the 04, 08, and 16 length variants. The a0 and a1 sets of kernels should only implement the 04 function and create the 08 and 16 variants as empty stubs. The candidate generation for the fast a0 and a1 kernels are proportionally slow enough that the length optimizations of the 08 and 16 variants simply don't provide significant benefit over the "pure" variant.

There are some kernels where using vector data types are beneficial even if they are executed on compute devices which have no native support for vector data types. A good example is NTLM running on a high end GPU. The performance gain comes from how the algorithm works and that there is in total 60 instructions that can be precomputed based only on the scalar base password. The base password never changes. The vectorization is done only in the inner loop, but from there it can access the precomputed (scalar) values from the outer loop. It saves both, instructions and resources. This is done automatically by the compiler because the structure of hashcat kernels allows the compiler to optimize it.

Kernel: fast hash type (pure)

The main purpose of pure kernels is to support long passwords (and salts) up to length 256. However, pure kernels are much easier to write than optimized kernels. First, it is only a single-hash and a multi-hash kernel to code. Second, it is expected you use the hashcat crypto libraries, for instance OpenCL/inc_hash_sha256.cl. To use the hashcat crypto libraries requires some detailed knowledge, please check the section on the hashcat crypto library below.

Each fast hash kernel source in pure mode has to provide the following kernel functions with this convention: mXXXXX_[mxx|sxx].

The pure kernels are supposed to run slower than optimized kernels, but it is hard to define a percentage which shows the performance difference because it largely depends on what kind of optimization you can use. For instance, for NTLM in which you can do meet-in-the-middle tests, the optimized kernel is around three times faster than the pure kernel. On contrary, for SHA256-HMAC they have exactly the same performance.

Kernel: slow hash type

Do not get the word "slow" in "slow kernel" wrong. This only means that the expected speed is so slow (or better said, the algorithm is so demanding) that the PCI Express Bottleneck is no longer relevant.

The slow hash kernel also supports pure and optimized kernel implementations.

In most cases you will develop only pure kernel implementations. An optimized slow hash implementation makes sense only if the _loop kernel uses parts of data (like the password or a salt) in its original form. Then you can do password length based optimizations. A good example is OpenCL/m00500-optimized.cl. However, these kernels are rare and therefore I will only describe the pure kernel implementation. There is also a specific kernel that I recommend looking at, OpenCL/m00500-pure.cl, you can use it for comparison.

As already mentioned, most slow hash modes do not use password length specific kernels like in a fast hash kernel. There is no "s04", "s08" kernels or anything like this. Dedicated single and multi-hash kernels also do not exist in this case, because it wouldn't make any sense or performance difference.

There are three kernels you need to implement. This means that you need to split the algorithm into a part which is done for initialization, a part which does the iterations (typically the part of the code which makes it slow) and final part where you do some comparisons or tests to see if the derived key matches. These kernels are:

It is obvious, but the kernels are executed in exactly this order: mXXXXX_init, mXXXXX_loop(N), mXXXXX_comp.

Along with the three kernels goes a context buffer called tmps which is accessible for read and write by all three kernels. The data type of this buffer is a void* and you cast it to a structure you need from inside the kernel. Hashcat knows about the size of the buffer because you returned it in the module in the module_tmp_size() method. This buffer is unique for every work item executed on the compute device. This means that hashcat will allocate a buffer on the compute device which has the size of your structure multiplied with the maximum possible work items which was discovered by the auto tuner. Each password candidate has its own tmps buffer allocated. The buffer is thread safe and free to be read or written to. You do not need to care about race conditions, mutexes, etc.

Typically it goes like this:

For slow hashes it is recommended to use vector data types if your algorithm allows to do so. If you use vector data types, use it in the _loop kernel only. Make sure to inform hashcat about using the appropriate opts_type option (see modules section).

Hashcat Crypto Library

The hashcat crypto library interface is very close to the OpenSSL interface with the typical Init(), Update() and Final() calls. But there are some important differences:

Working with the hashcat crypto libraries is straightforward. There are however some limitations you need to know about and you need to align with. The functions are designed to make it more easy for you to develop kernels, but they are written with performance in mind. This is achieved by using different optimization techniques. For instance, a crypto library cannot know how much data the user will provide. It therefore has to keep some buffers in the context to maintain some offsets. Each update() typically changes the buffer values and the offset. Typically you would code this by using some pointers. But pointers are poison for high performance. The computation of the address requires at least a temporary register, one or more mul() and another add() instruction call. This can be avoided. To do so, most of the code is using large switch() statements to enable the kernel compiler to translate a lot of code directly to register without the need to use an address to access a certain value in an array. But this goes too deep. Check OpenCL/inc_common.cl if you want to know more about the details.

The most important limits are the following:

Note that the type of buffer which holds the data is relevant, too. There are specific functions for working with local memory arrays and global memory arrays.

Often there is also a function which does the byte swapping for you. For instance, there is not only sha1_update() but there is also sha1_update_swap(). The prototypes are the same. There is also sha1_update_utf16le() and sha1_update_utf16le_swap(). I am sure you got the idea. If some helper function is missing, feel free to commit it to upstream but in a dedicated PR.

Since hashcat 6.0.0 it is also possible to use the hashcat crypto library from the host code. This is done by the emulation macros. To use a library, you just need to include emu_inc_hash_sha1.h or appropriate. Keep in mind that the limitations are exactly the same as if you use them from inside a kernel. A good example is src/wordlist.c or src/modules/module_12600.c.

About Salts

In hashcat, we have two different types of salt structures. There is a fixed size data type and a generic size data type. This goes back to how hashcat was created and how it evolved over time. However, it turned out the concept still works very good even with the most complex algorithms of today.


The salt_t is a fixed size data type which is defined in OpenCL/inc_types.cl and holds a number of configuration settings and buffers with different meanings. However, they all are using 32 bit integers exclusively. This goes back to the fact that GPU registers are always 32 bit. You can work with 8 bit integers, but will make the GPU slower because it has to emulate an 8 bit register behavior (which is done transparently from your perspective). We however are trying to avoid this by sticking to u32 data type buffers for your entire kernel to achieve best performance. I will now explain the components of the salt_t structure in detail:


Of course there are also generic buffers in case the data of your hash mode simply covers additional data like encrypted data, IV, etc. or simply salt buffers which are too long to fit into the standardized salt_t structure. To define your own struct, you need to define it in the module as well as in the kernel. Since both source codes are independent from each other, you need to maintain them and guarantee that they are synchronized. The esalt buffers and structs in the corresponding src/modules/ and OpenCL/ plugin files need to be the same and any change in one of these esalt structs in one of these source files would need to be accompanied by a change of the other file too. Other than that, it is a simple process. As described in the decoder section, hashcat needs to know the size of the structure so it can allocate enough memory space for it at the initialization phase. In order to inform the hashcat host binary of the esalt size, you must provide it via the function module_esalt_size(). It could be either a maximum size (upper limit) or a constant size. This depends on the algorithm. That is all. You can now cast the void *esalt_buf which is provided to you in the decoder and encoder functions to your esalt structure type. Note this address is maintained by hashcat. It guarantees a fresh buffer for each invocation of module_hash_decode(). Therefore, you can simply cast it for instance like this:

wpa_eapol_t *wpa_eapol = (wpa_eapol_t *) esalt_buf;

You can access the esalt from the encoder function and read the data the exact same way by casting it to your esalt struct.

Important: No matter if you are going to use an esalt or not, you always need to fill the salt_buf[] array and set the salt_len for it.

Important: In case you write a slow hash, you need to set the salt_iter element. Also never forget to implement module_esalt_size() if you use an esalt.

Data Structures: salt_t vs esalt

Hashcat has a generic data structure to handle "easy" salts (salt_t) and an open data structure one, that you can define yourself in case the generic data structure does not fit your needs (esalt). An "easy" salt is a single buffer salt of less than 256 byte (can be binary). The open data structure is called "esalt".

It is important to understand the difference between the generic salt_t struct and the esalt struct (which you create on your own). In fact it is essential to distinguish them. Based on the data we assign to the salt_t struct, hashcat sorts and groups all digests belonging to this particular salt_t first by the salt buffer content in salt->salt_buf[]. We can see it as a top level grouping. With the data we assign to the esalt, hashcat sorts and groups all hashes, but on a level below. Like in a SQL Statement when we do something like:

SELECT digest FROM hashes GROUP BY salt_t,esalt

But why is that? It is an optimization. If we have different data stored in salt_t and esalt, hashcat expects us to access salt_t data in the _init kernel and expects us to access the data of our esalt only from the _comp kernel. But that is an optional step and probably something of which you did not think of if you never wrote a hashcat plugin before.

A good example is the WPA mode. The crypto scheme in this mode requires multiple salt fields (IVs, mac addresses, encrypted data, etc). To derive the PMK master key (the slow part in the algorithms), only the ESSID (the network name) is required. If we want to crack WPA, typically we capture multiple handshakes, however all handshakes could be belonging to the same network. If we are clever we can exploit this weakness. In the parser we would set only the ESSID in the salt_t struct and all other data (IV, MAC addresses, etc.) go into the esalt. By doing so, hashcat will only spawn that many _init and _loop kernels as there are unique ESSIDs in the generic salt_t buffers. If we have captured 100 handshakes of the same network, hashcat only needs to run the compute intensive _loop kernel one time, not 100 times. But how to use the _comp kernel in this case?

Now let us talk about the three different types of _comp kernels. The first type is if we have an easy crypto algorithm which only contains a single salt buffer in the salt_t struct. Imagine we have ten real hashes but they all share the same salt. In such a case there is still no need to run the _init and the slow _loop kernel ten times. A single invocation of both is enough. In the _comp kernel hashcat will search a database if the digest exists in a database which is important. We only need to search this database for the existence of the hash, we do not need to decrypt something. This database is created automatically by hashcat at the very start. Every digest which we assign to the digest_t struct will be sorted and stored inside this database. Inside the _comp kernel, if we assign the final digest to r0 - r4 and call the #include COMPARE_M macro, the database gets searched. This code is highly optimized and is using a bloom filter and an additional binary tree search. So it can handle millions of hashes very efficiently. See OpenCL/m00500-pure.cl as an easy example.

Sometimes this is not enough. For instance, if we do not have a final digest which we could search for "existence" in the database. This happens if we have to decrypt some data and match the content of the decrypted data against some known pattern. It is obvious if we match data in this case we can not search the data for existence, right? In this case we actually need to iterate through all entries in the database. This is something very irregular from the general hashcat concept but there is a way to deal with it. Of course, if you only support single targets this is not a problem. The recommended way to deal with this is to verify if the salt_buf data you are using is unique. The goal is to force hashcat to call the _comp kernel as many times as it loads unique hashes from your hash list and iterates through all of them individually. If we choose to use this mode, it is essential for the salt_t buffer to be exactly as unique as the esalt buffer (the same number of entries). This could be achieved by using parts of encrypted data and copying it to salt->salt_buf[]. Hashcat will be forced to increment the digests_offset variable for each iteration which gives you the opportunity to index the different hashes individually. A good example for such a _comp kernel can be found in OpenCL/m14700-pure.cl.

There is even a third mode which is close to the second mode, but does not have the disadvantage of syncing the salt_t with the esalt, giving you the opportunity to exploit salt specific vulnerabilities in the algorithm (like in WPA). This mode can be activated by setting OPTS_TYPE_DEEP_COMP_KERNEL flag in the module. In this case hashcat will know that it has to call the _comp kernel for that many entries that are bound to a unique salt_t entry. So far there are only two algorithms which make use of that. This mode should therefore only be used in very rare cases and is discouraged if not applicable. See OpenCL/m22000-pure.cl as an example.

Most kernels today go for the second mode. However, if possible we should use the first mode because it is much more elegant. There is a trick to step down from the second mode to the first mode. In case we need to match some data after decryption but we know 100% of the data it is better to encrypt the known plaintext data instead of decrypting the encrypted data. In this case our final value can be searched in a database for "existence" and we can operate in first mode (i.e. make a "lookup").


The tokenizer is basically a CSV parser but with some special features. When it comes to loading hash files it is sometimes not so easy. The hash files often have been generated by extraction tools which do not follow CSV rules very carefully. Often the CSV is broken because of missing escape characters, quotation or other problems. Additionally, the tokenizer covers a preliminary data sanitization and data length check.

Of course we could have used a regular expression engine to do the same. But if you think of the potential multi million hash entries in hash files the tokenizer is much faster than a regular expression. Also the tokenizer offers configuration items that are known to be relevant when it comes to parsing hashes. The configuration of the tokenizer is very easy to read for third parties and gives an easy overview of how the hash lines are separated from a standardized perspective.

One very unique feature is that the tokenizer allows you to have both dynamic length columns and fixed length columns in the same hash line. This is sometimes the only way to read a hash line. Another unique feature is that it allows you to change the separator character for different columns in the same hash line. This is why you have to specify the separator character for each column separately.

The first step after declaring the tokenizer context buffer is to create its configuration. There is just one mandatory parameter and a maximum of 128 optional configuration items (columns). The mandatory configuration item needs to be set to the number of columns/fields which the hash line includes. Note that this is a fixed value. For more complex hash lines with a dynamic column count you need to create multiple tokenizer instances (e.g. use a second configuration, if the first one failed), but in most of the times this is not required.

hc_token_t token;

token.token_cnt = 1;

For a very simple, unsalted hash there is just one column: the hash. If this is a MD5 hash, it is typically encoded as a hex string of exact size 32 byte. These properties we can configure for the first column:

token.len_min[0] = 32;
token.len_max[0] = 32;
token.attr[0]    = TOKEN_ATTR_VERIFY_LENGTH
                 | TOKEN_ATTR_VERIFY_HEX;

The parameters len_min and len_max always define a valid range in bytes. Since it is always 32 byte, we simply set 32 to both parameters. With the configuration item TOKEN_ATTR_VERIFY_LENGTH we inform the tokenizer to verify the data length. If the length does not match, we will refuse the hash. The same goes for the configuration item TOKEN_ATTR_VERIFY_HEX. As you can imagine, this informs the tokenizer to verify if the data contains only hex characters (no matter the case). For more verification configuration items please see includes/types.h.

Finally the tokenizer is called. If any of the verification configuration items do not pass, the tokenizer will return a specific error code. As always, the error codes can be found in includes/types.h.

const int rc_tokenizer = input_tokenizer ((const u8 *) line_buf, line_len, &token);

if (rc_tokenizer != PARSER_OK) return (rc_tokenizer);

If everything went well up to this point, the tokenizer has placed the pointer addresses for the start to each of the columns in the token.buf[] array and the corresponding length in the token.len[] array. For instance, if there is only one column, the pointer address in token.buf[0] will be populated as well as token.len[0]. If you have two columns (token_cnt = 2), there will also be token.buf[1] and token.len[1], and so on.

As always, you can use the tokenizer configurations in the existing modules as a reference, especially for complex hash lines.

For hash lines with multiple columns, we need to use for each column either TOKEN_ATTR_FIXED_LENGTH (see below) or configure the separator character. The separator has to be a single byte character and is set using the "sep" parameter.

token.sep[0]     = ':';
token.len_min[0] = 32;
token.len_max[0] = 32;
token.attr[0]    = TOKEN_ATTR_VERIFY_LENGTH
                 | TOKEN_ATTR_VERIFY_HEX;

token.len_min[1] = 0;
token.len_max[1] = 32;
token.attr[1]    = TOKEN_ATTR_VERIFY_LENGTH;

In the above example you can see that we have hard-coded the separator character to ':'. In addition, this type of configuration the tokenizer will refuse the hash line if the separator was not found. You can also use the hashconfig->separator character if you want to use the separator character the hashcat user set using the -p command line option (default being ':').

There is one more configuration item which I want to describe: