Whois Parse / Extraction Table Configuration
Configurable Data Extraction
Watch My Domains is designed to be highly user-configurable. This allows the software to adapt quickly to format changes in whois data and to newly introduced TLDs.
This article explains the whois extraction table used in Watch My Domains SED.
Changing Data Extraction Settings
You can access the 'Whois Setup' panel from the side toolbar.
Type in a TLD to retrieve its current settings, and then click the 'Extraction' tab. Please see the screenshot below.
Parse Table
The parse table consists of a set of entries, one per line, that look like
token=>column
For example,
Last Update:=>last_update
... will look for a token called Last Update: in the raw WHOIS text row and extract whatever comes after it to the last_update domain data column.
If the data occurs with a unique token, you will need to enter only the unique token. For example, if the registry WHOIS contains a line like
Registrant Organization:Softnik Technologies
... the whois parse table entry will be
Registrant Organization:=>Owner
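The basic token=>column rule can be illustrated with a minimal Python sketch. It is not the application's actual parser; the function names (split_entry, extract_simple) are hypothetical, and the sketch assumes a case-sensitive match that stops at the first matching line.

```python
def split_entry(entry):
    """Split a 'token=>column' parse table entry into (token, column)."""
    token, _, column = entry.partition("=>")
    return token, column


def extract_simple(whois_text, entry):
    """Return {column: value} for the first WHOIS line containing the token.

    Assumption: matching is case-sensitive and only the first hit is used.
    """
    token, column = split_entry(entry)
    for line in whois_text.splitlines():
        pos = line.find(token)
        if pos != -1:
            # Everything after the token on the same line is the value.
            return {column: line[pos + len(token):].strip()}
    return {}


sample = "Domain Name: example.com\nRegistrant Organization:Softnik Technologies\n"
print(extract_simple(sample, "Registrant Organization:=>Owner"))
```

Keeping the token and the column name as plain strings mirrors the way the entries are typed into the Extraction tab.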
If the data is in the next line, you can add a {nl} to the token. For example, if the whois data shows
..
..
Registrant:
    Softnik Technologies
..
..
The parse table entry will be
Registrant:{nl}=>Owner
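As a rough illustration of the {nl} marker, the following Python sketch takes the value from the line that follows the token. The function name extract_next_line is hypothetical, and the sketch assumes the value is always on the line immediately after the token; the real parser may behave differently.

```python
NL_MARKER = "{nl}"


def extract_next_line(whois_text, entry):
    """Return {column: value}, where value is the line after the token line."""
    token, _, column = entry.partition("=>")
    if not token.endswith(NL_MARKER):
        raise ValueError("entry does not use the {nl} marker")
    token = token[:-len(NL_MARKER)]  # strip the marker, keep the literal token

    lines = whois_text.splitlines()
    # Stop one line early so that lines[i + 1] always exists.
    for i, line in enumerate(lines[:-1]):
        if token in line:
            return {column: lines[i + 1].strip()}
    return {}


sample = "Registrant:\n    Softnik Technologies\nAdmin Contact:\n"
print(extract_next_line(sample, "Registrant:{nl}=>Owner"))
```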
Multiple Occurrences of a Token
If a token occurs multiple times, all occurrences are collected and extracted as a single value, separated by commas.
You can use {n} (where n is a digit) in the WHOIS parser token to pick a specific index when there are multiple occurrences. For example,
organization:{2}=>organization
... will pick the 2nd occurrence of the organization: entry in WHOIS output.
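The two behaviors described in this section (collect everything, or pick the n-th occurrence) can be sketched in Python as follows. The function name extract_occurrence is hypothetical, and the exact separator (", " here) is an assumption; the article only states that a comma is used.

```python
import re


def extract_occurrence(whois_text, entry):
    """Collect all values for a token, or only the n-th one if {n} is given."""
    token, _, column = entry.partition("=>")

    # A trailing {n} (single digit) selects one occurrence, counted from 1.
    index = None
    match = re.search(r"\{(\d)\}$", token)
    if match:
        index = int(match.group(1))
        token = token[:match.start()]

    values = []
    for line in whois_text.splitlines():
        pos = line.find(token)
        if pos != -1:
            values.append(line[pos + len(token):].strip())

    if index is None:
        # Assumption: values are joined with a comma followed by a space.
        return {column: ", ".join(values)} if values else {}
    if 1 <= index <= len(values):
        return {column: values[index - 1]}
    return {}


sample = "organization: Registry Inc\norganization: Softnik Technologies\n"
print(extract_occurrence(sample, "organization:{2}=>organization"))
print(extract_occurrence(sample, "organization:=>organization"))
```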
Multiple Entries with a heading Token
Sometimes there will be a heading token with a number of rows below it. For example...
...
Name Servers:
    ns1.softnik.com
    ns2.softnik.com
    ns3.softnik.com
...
...
In such cases, use {ml} in the token to indicate that multiple lines following the token should be extracted.
Name Servers:{ml}=>name_servers
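A minimal Python sketch of the {ml} idea is shown here. The function name extract_multiline is hypothetical. How the real parser decides where the block ends is not documented in this article; the sketch assumes that a blank line ends it and that the collected rows are joined with commas.

```python
ML_MARKER = "{ml}"


def extract_multiline(whois_text, entry):
    """Return {column: value} built from the rows that follow a heading token."""
    token, _, column = entry.partition("=>")
    if not token.endswith(ML_MARKER):
        raise ValueError("entry does not use the {ml} marker")
    token = token[:-len(ML_MARKER)]

    lines = whois_text.splitlines()
    for i, line in enumerate(lines):
        if token in line:
            values = []
            for following in lines[i + 1:]:
                if not following.strip():
                    break  # assumption: a blank line closes the block
                values.append(following.strip())
            return {column: ", ".join(values)}
    return {}


sample = (
    "Name Servers:\n"
    "    ns1.softnik.com\n"
    "    ns2.softnik.com\n"
    "    ns3.softnik.com\n"
    "\n"
    "Status: ok\n"
)
print(extract_multiline(sample, "Name Servers:{ml}=>name_servers"))
```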
Entries that appear in multiple blocks
Sometimes the same token will appear in multiple blocks. For example, the 'Owner' token can appear under 'Administrative Contact', 'Registrant' and 'Technical Contact'. If you only want the 'Owner' under 'Registrant', use...
Registrant@@Owner:=>registrant
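The block-scoped lookup can be pictured with the following Python sketch. The function name extract_in_block and the sample WHOIS text are hypothetical, and the sketch assumes that a block starts at the line containing the heading and ends at the next blank line; the actual parser may delimit blocks differently.

```python
def extract_in_block(whois_text, entry):
    """Return {column: value} for a token found inside the named block only."""
    token, _, column = entry.partition("=>")
    block, _, inner = token.partition("@@")

    inside = False
    for line in whois_text.splitlines():
        if not inside:
            # Look for the block heading before matching the inner token.
            inside = block in line
            continue
        if not line.strip():
            inside = False  # assumption: a blank line ends the block
            continue
        pos = line.find(inner)
        if pos != -1:
            return {column: line[pos + len(inner):].strip()}
    return {}


sample = "\n".join([
    "Administrative Contact",
    "    Owner: Admin Person",
    "",
    "Registrant",
    "    Owner: Softnik Technologies",
    "",
    "Technical Contact",
    "    Owner: Tech Person",
])
print(extract_in_block(sample, "Registrant@@Owner:=>registrant"))
```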
Defining the Parse Table
When you make changes to the parse table, it is assumed that you will provide the definitions for all columns.
This means that the application no longer uses the internal parse table. You can use [*]=>* as the first entry to tell the parser to override this behavior, so that the internal parse table remains in use. In most cases you want this. However, if you are going to define all the extraction tokens manually, don't insert that entry.
Here is an example.
[*]=>*
Registrant Address:=>address
Registrant:@@Name:=>owner
Renewal date{nl}=>registrar_expiry
Expiry date:=>expiry_date
Name servers:{ml}=>name server
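One way to picture the effect of the [*]=>* entry is the Python sketch below. It is only an illustration: load_parse_table and the built-in entry shown are hypothetical, and the assumption that custom entries are listed ahead of the internal ones is not confirmed by this article.

```python
WILDCARD_ENTRY = "[*]=>*"


def load_parse_table(table_text, builtin_entries):
    """Return the effective list of 'token=>column' entries.

    If the first custom entry is [*]=>*, the built-in entries stay in use;
    otherwise only the custom entries are used.
    """
    entries = [line.strip() for line in table_text.splitlines() if line.strip()]
    if entries and entries[0] == WILDCARD_ENTRY:
        # Assumption: custom entries are listed before the built-in ones.
        return entries[1:] + list(builtin_entries)
    return entries


# Hypothetical stand-in for the application's internal parse table.
builtin = ["Creation Date:=>create_date"]

custom = "[*]=>*\nRegistrant Address:=>address\nExpiry date:=>expiry_date\n"
print(load_parse_table(custom, builtin))
```

Without the [*]=>* line, the same call returns only the custom entries, which matches the case where every extraction token is defined manually.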
Parsing the Whois Data Again
You can use the 'Parse Whois' button on the side toolbar to re-parse the WHOIS after changes are made to the extraction table (without doing a new WHOIS lookup).