Article covers IPv4 addresses which use 32 bits and similar technique can be used to IPv6 addresses which use 128 bits by using 2 BIGINTs to store (64bits each).
It is easy to handle reasonable load when one is querying for few IPs or few hundred IPs either from simple user interface entering one or more IP addresses or through APIs. In these cases IP addresses can simply be matched in string format without much of performance issue but it can become a performance bottleneck when dealing with millions of rows loaded through ETL process into Data Warehouse environment and reports are run against fact tables with hundreds of millions of rows. Under these circumstances converting IP to an integer will boost the performance substantially.
MySQL provides functions to convert IP string to integer and backwards. For example, inet_aton converts string to a number (integer) and inet_ntoa converts number to integer.
inet_aton and inet_ntoa:
mysql> select inet_aton('192.168.100.200'); +------------------------------+ | inet_aton('192.168.100.200') | +------------------------------+ | 3232261320 | +------------------------------+ mysql> select inet_ntoa(3232261320); +-----------------------+ | inet_ntoa(3232261320) | +-----------------------+ | 192.168.100.200 | +-----------------------+ 1 row in set (0.00 sec)
Bit operations:
The functions are simple to use and at the lower level they basically perform bit shifting. In the above example, '192.168.100.200' is represented as 11000000 10101000 01100100 11001000 in binary form. For more on binary system or to convert a number from decimal to binary see wiki here.
To convert string IP to integer function inet_aton multiplies 192 by 256*256*256, 168 by 258*256, 100 by 256 and then adding 200 (last octet in the above ip address), that is,
and it is same as 192<<24 i.e., left shift 192 by 24 bits, 168 << 16 and 100 << 8. Bit shifting is a much faster multiplication since CPU can use its register to move the bit in simple operation.
In reverse, one can not simply divide the ip integer (3232261320) to get back the ip address in octet string format. But one need to appropriately left and right shift the number (or do multiply and then divide) to get the octet that is of interest. To get the most highest valued position (192), shift the bits to right. For example,
11000000 10101000 01100100 11001000 >> 24 (right shift 24 bits) gives you
00000000 00000000 00000000 11000000
Therefore, all the right most 24 bits dropped off the register (cliff) and we are left with 4th octet 11000000 prefixed with 24 zeros.
mysql> SELECT 3232261320>>24 as 4th_octet; +-----------+ | 4th_octet | +-----------+ | 192 | +-----------+ 1 row in set (0.00 sec)
By the same logic to get the 3rd octet, we can move the bits 16 times to the right but we will be left with 4th octet as well. Doing
11000000 10101000 01100100 11001000 >>16 will result in
00000000 00000000 11000000 10101000
So we should first shift the bits to left so that the most highest octet (4th octet) drops off the register. We can do so by shifting left by 8 bits.
mysql> SELECT 3232261320<<8 as 8blsn; -- 8blsn: 8 bit left shift number +--------------+ | 8blsn | +--------------+ | 827458897920 | +--------------+ 1 row in set (0.00 sec)
Instead of getting a smaller number than before we got a larger number! And obviously, right shifting this number by 16bits to right will not give the 3rd octet of IP. This happens due to fact that MySQL converted int to bigint silently. The 4th octet moved into 5th octet resulting in appending 8 zeros to the right as below.
11000000 10101000 01100100 11001000 00000000
Since bigint is 8 bytes long one need to shift not by 1 byte (8bits) but 5 bytes (40 bits) to delete the highest octet (#192).
Just with 4 byte (32 bit) left shift the register would look like below.
11000000 10101000 01100100 11001000 00000000 00000000 00000000 00000000
And with 40 bit left shift the register would look like thus removing (#192 - 11000000 ),
10101000 01100100 11001000 00000000 00000000 00000000 00000000 00000000
Now right shift 56 bits to get the 3rd octet (#168).
00000000 00000000 00000000 00000000 00000000 00000000 00000000 10101000
This is same as
(3232261320 *256 *256 *256 *256 *256) / (256 *256 *256 *256 *256 *256 *256)
mysql> SELECT 3232261320><<40>>56; +--------------------+ | 3232261320<<40>>56 | +--------------------+ | 168 | +--------------------+ 1 row in set (0.00 sec)
To get the 2nd octet first left shift the original number 48 bits and then right shift by 56 bits. For the 1st octet left shift by 56 bits both ways.
mysql> SELECT 3232261320<<48>>56; +--------------------+ | 3232261320<<48>>56 | +--------------------+ | 100 | +--------------------+ 1 row in set (0.00 sec) mysql> SELECT 3232261320<<56>>56; +--------------------+ | 3232261320<<56>>56 | +--------------------+ | 200 | +--------------------+ 1 row in set (0.00 sec)
With all imperative/procedural languages providing the bit operators (AND, OR, bit shift, etc), one can and should convert the IP string to integer before being loaded to staging. Thus the parsing or conversion can also be easily parallel processed for large loads.
An example in C sharp (C#) is shown below.
// // C# program to convert IP address string to an integer. // using System; namespace IP_addr_to_int { class Program { static void Main(string[] args) { // Infinite loop until user enters control-c. for (; ; ) { string ip = ""; Console.Write("Enter ip address: "); ip = Console.ReadLine(); ulong ipNum = 0; string[] ipArr = ip.Split('.'); if (ipArr.Length != 4) { Console.WriteLine("Error. Enter four octet IP address only."); // Environment.Exit(-1); continue; } for (int ii = ipArr.Length - 1; ii >= 0; ii--) { ulong ipOctet = Convert.ToUInt64(ipArr[3 - ii]); // ipOctet * 256 * 256 * 256; if (ii == 3) { ipNum += ipOctet << 24; } // ipOctet * 256 * 256; else if (ii == 2) { ipNum += ipOctet << 16; } // ipOctet * 256; else if (ii == 1) { ipNum += ipOctet << 8; } // ipOctet * 1 else { ipNum += ipOctet } // System.Threading.Thread.Sleep(1000); } Console.WriteLine (Convert.ToString(ip)+ " converted to integer " + Convert.ToString(ipNum)); } } } }
For more ways to convert an IP see Part 2 and for tools see Part 3.
Cheers,
Shiva
No comments:
Post a Comment