A Service of Softnik Technologies

Data Extraction / String Tokens for Whois Configuration


Note: The string tokens are slightly different in the Mac OS X version. Please see the whois parse table for configuring the Mac OS X version of Watch My Domains.


String Tokens / Data Extraction Tokens

All our domain name portfolio management software products use a highly customizable scheme for obtaining various data from whois records. This is done through what we call string tokens that identify key data elements.

Watch My Domains Whois Setup


A string token is simply a small phrase or word combination that appears just prior to the data that needs to be extracted. The software will look for the string token and then extract the text that follows the token.


...
Registrant Organization:  Softnik Technologies
...

In the above case the string token for the "Owner" field is

Registrant Organization:

Please note that the software has a large number of string tokens already loaded into it. You will need to specify the string token only if the software is unable to extract the relevant data.
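The lookup described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the code used inside Watch My Domains; the function name extract_after_token is hypothetical, and the sketch assumes the value ends at the end of the line.

```python
from typing import Optional


def extract_after_token(record: str, token: str) -> Optional[str]:
    """Return the text that follows the first occurrence of `token` on its line."""
    pos = record.find(token)
    if pos == -1:
        return None  # token not present in this whois record
    rest = record[pos + len(token):]
    # Keep only the remainder of the line and trim surrounding whitespace.
    return rest.split("\n", 1)[0].strip()


# Sample record mirroring the snippet above ("..." stands for other lines).
record = "...\nRegistrant Organization:  Softnik Technologies\n...\n"
owner = extract_after_token(record, "Registrant Organization:")
print(owner)  # Softnik Technologies
```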

There are times when simple string token settings will not work. Here are a couple of advanced string token examples.

@@

The @@ token is used to extract data from a specific block within the whois record. Here is an example...

[Zone-C]@@Name:

German domain records have multiple blocks of data, each with the same text tokens. The above will allow you to extract the "Registrar Name" (look for the token "Name:" that follows "[Zone-C]" in the whois records).


...
[Tech-C]
Type: ROLE
Name: Domain Administrator
...
...

[Zone-C]
Type: ROLE
Name: HostEurope GmbH
...

If you specify the token as "Name:", the software will pick the first entry and set the registrar name to "Domain Administrator". Using the @@ token will ensure that the "Name:" entry after the "[Zone-C]" is picked.
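One way to picture the @@ behaviour is as a two-step search: find the block marker first, then look for the token only after it. The Python sketch below works under that assumption; extract_with_block is a hypothetical helper, not part of the product.

```python
from typing import Optional


def extract_with_block(record: str, spec: str) -> Optional[str]:
    """Handle 'BLOCK@@TOKEN' as well as a plain 'TOKEN' specification."""
    block, sep, token = spec.partition("@@")
    if not sep:
        # No @@ present: the whole spec is the token, searched from the top.
        block, token = "", spec
    start = record.find(block)
    if start == -1:
        return None  # block marker not found
    # Search for the token only after the block marker.
    pos = record.find(token, start + len(block))
    if pos == -1:
        return None
    return record[pos + len(token):].split("\n", 1)[0].strip()


# Sample record mirroring the German whois snippet above.
record = (
    "[Tech-C]\nType: ROLE\nName: Domain Administrator\n\n"
    "[Zone-C]\nType: ROLE\nName: HostEurope GmbH\n"
)
plain = extract_with_block(record, "Name:")
scoped = extract_with_block(record, "[Zone-C]@@Name:")
print(plain)   # Domain Administrator
print(scoped)  # HostEurope GmbH
```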

#L#

Some whois records may always contain the required data at a specific line number. Use the #L#n token to pick the text at line n.

For example, #L#2 will pick the 2nd line.

The #L# token is mostly used along with the @@ token.

In the following record, the country code occurs at the 6th line after "Registrant Contact Details:".


...
 Registrant Contact Details:
    xxxxx
    John Doe
    xxxxx
    London
    London,xxxx
    GB
    Tel. +xx xxxx
...

So, use...

Registrant Contact Details:@@#L#6

The above will pick the 6th line after the occurrence of the token "Registrant Contact Details:". Please note that blank lines are ignored, so disregard any blank lines when entering the line number.
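The line counting, including the rule that blank lines are skipped, can be sketched as follows. This is an illustrative Python approximation only; extract_line_after is a hypothetical name, and the sketch assumes nothing follows the anchor token on its own line.

```python
from typing import Optional


def extract_line_after(record: str, spec: str) -> Optional[str]:
    """Handle 'ANCHOR@@#L#n': return the n-th non-blank line after ANCHOR."""
    anchor, sep, line_part = spec.partition("@@#L#")
    if not sep or not line_part.isdigit():
        raise ValueError("expected a spec of the form ANCHOR@@#L#n")
    n = int(line_part)
    pos = record.find(anchor)
    if pos == -1:
        return None  # anchor token not found
    rest = record[pos + len(anchor):]
    # Blank lines are dropped before counting, as described above.
    lines = [ln.strip() for ln in rest.splitlines() if ln.strip()]
    return lines[n - 1] if 1 <= n <= len(lines) else None


# Sample record mirroring the snippet above, with one extra blank line added.
record = (
    " Registrant Contact Details:\n"
    "    xxxxx\n    John Doe\n\n    xxxxx\n"
    "    London\n    London,xxxx\n    GB\n    Tel. +xx xxxx\n"
)
country = extract_line_after(record, "Registrant Contact Details:@@#L#6")
print(country)  # GB
```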

#ML#

The #ML# token is used to extract multiple entries and merge them into a single record. This is mainly used for addresses.

For example

#ML#registrant-street1:#registrant-street2:#registrant-pcode:#registrant-state:#registrant-city:#registrant-ccode:#

will pick entries corresponding to each of the tokens above and merge them into the address record. Remember to separate each token with a #.
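As a rough Python sketch of the merge: split the specification on #, look up each token in turn, and join whatever was found. The function name extract_merged, the ", " separator and the sample address values are all assumptions made for illustration; the separator the software actually uses is not stated here.

```python
def extract_merged(record: str, spec: str, joiner: str = ", ") -> str:
    """Handle '#ML#tok1#tok2#...#': merge the values found for each token."""
    prefix = "#ML#"
    if not spec.startswith(prefix):
        raise ValueError("expected a spec starting with #ML#")
    # Tokens are separated by '#'; ignore the empty piece after the final '#'.
    tokens = [t for t in spec[len(prefix):].split("#") if t]
    parts = []
    for token in tokens:
        pos = record.find(token)
        if pos == -1:
            continue  # this entry is absent from the record; skip it
        value = record[pos + len(token):].split("\n", 1)[0].strip()
        if value:
            parts.append(value)
    return joiner.join(parts)


# Hypothetical record: the field values are made up for this example.
record = (
    "registrant-street1: 1 Example Road\n"
    "registrant-pcode: 00000\n"
    "registrant-city: Exampleville\n"
    "registrant-ccode: GB\n"
)
spec = ("#ML#registrant-street1:#registrant-street2:#registrant-pcode:"
        "#registrant-state:#registrant-city:#registrant-ccode:#")
address = extract_merged(record, spec)
print(address)  # 1 Example Road, 00000, Exampleville, GB
```

Note that the merged values appear in the order the tokens are listed, not in the order they occur in the record.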

{lf} and {rf}

You can use {lf} and/or {rf} to indicate a linefeed (\n) and a carriage return (\r), respectively. For example, if a whois record has something like


...
Renewal date:
 12-May-2015
...

Use the token

Renewal date:{lf}

to indicate that the expiry date is on the line after "Renewal date:".

Most whois records contain only linefeeds and not carriage returns, so you will almost always use {lf}. If {lf} doesn't work, you can try replacing it with {rf} or {rf}{lf}.
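A simple way to think about {lf} and {rf} is as placeholders that are replaced with the real control characters before the search. The Python sketch below assumes exactly that; expand_token and extract_after_token are hypothetical names, not the software's own.

```python
from typing import Optional


def expand_token(token: str) -> str:
    """Replace the {lf} and {rf} placeholders with real control characters."""
    return token.replace("{lf}", "\n").replace("{rf}", "\r")


def extract_after_token(record: str, token: str) -> Optional[str]:
    """Return the rest of the line that follows the expanded token."""
    literal = expand_token(token)
    pos = record.find(literal)
    if pos == -1:
        return None
    rest = record[pos + len(literal):]
    return rest.split("\n", 1)[0].strip()


# Linefeed-only record, mirroring the snippet above.
unix_record = "...\nRenewal date:\n 12-May-2015\n...\n"
# The same record with carriage return + linefeed line endings.
crlf_record = "...\r\nRenewal date:\r\n 12-May-2015\r\n...\r\n"

expiry_lf = extract_after_token(unix_record, "Renewal date:{lf}")
expiry_crlf = extract_after_token(crlf_record, "Renewal date:{rf}{lf}")
print(expiry_lf)    # 12-May-2015
print(expiry_crlf)  # 12-May-2015
```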

You can also use the {lf} token to ensure that the software picks the right occurrence of a token. For example, in some cases the token string may be present in multiple places in the whois record.


...
For complete Registrant details go to ...

...
...
Domain Name .... xxxx.xx
Registrant.... xxxxx xxxx
...

In the above whois snippet, the word "Registrant" occurs twice. You can use "{lf}Registrant" (without the quotes) to match only the occurrence at the start of a line, which ensures that the correct one is picked for data extraction.
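The difference between the two tokens can be seen in a short Python sketch. As before, this is only an approximation of the behaviour described here; text_after is a hypothetical helper, and the sketch returns the raw remainder of the line without trimming the filler dots.

```python
from typing import Optional


def text_after(record: str, token: str) -> Optional[str]:
    """Expand {lf}/{rf}, find the token, and return the rest of that line."""
    literal = token.replace("{lf}", "\n").replace("{rf}", "\r")
    pos = record.find(literal)
    if pos == -1:
        return None
    return record[pos + len(literal):].split("\n", 1)[0].strip()


# Sample record mirroring the snippet above.
record = (
    "For complete Registrant details go to ...\n"
    "\n"
    "Domain Name .... xxxx.xx\n"
    "Registrant.... xxxxx xxxx\n"
)

wrong = text_after(record, "Registrant")      # matches inside the first sentence
right = text_after(record, "{lf}Registrant")  # matches only at the start of a line
print(wrong)  # details go to ...
print(right)  # .... xxxxx xxxx
```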