If you are not familiar with creating network sockets, this is roughly the networking equivalent of opening a file to read or write. What makes this instance unusual are the particular arguments supplied. For TCP and UDP sockets, the first argument is normally AF_INET (address family Internet), and the second argument is normally SOCK_DGRAM for UDP datagrams and SOCK_STREAM for TCP streams. The combination of AF_PACKET and SOCK_RAW tells the operating system to create a socket beneath the level of the entire TCP/IP stack.
Now is probably a good time to point out that in general, only the superuser (root) can do this. The third argument is the protocol field, converted to network byte order by htons (host-to-network-short). The only header needed for this part should be the one that declares socket, though some systems may require a second one for htons. Assuming all went well, s is now set to a socket descriptor (a non-negative value).
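A minimal sketch of the socket call described above, assuming a Linux system. The helper name open_packet_socket is hypothetical, and the header choices are this sketch's own, not necessarily the ones the original code used.

```c
#include <arpa/inet.h>      /* htons */
#include <errno.h>
#include <linux/if_ether.h> /* ETH_P_* EtherType constants */
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>     /* socket, AF_PACKET, SOCK_RAW */

/* Hypothetical helper: open a packet socket bound to one EtherType.
 * Returns the descriptor (>= 0) on success or -1 on failure. */
int open_packet_socket(unsigned short proto)
{
    /* The protocol argument must be in network byte order. */
    int s = socket(AF_PACKET, SOCK_RAW, htons(proto));

    if (s < 0) {
        /* Most often a permissions problem: packet sockets normally
         * require root (or the CAP_NET_RAW capability). */
        fprintf(stderr, "socket: %s\n", strerror(errno));
    }
    return s;
}
```

A caller would pass the EtherType it cares about, for example open_packet_socket(ETH_P_IP), and check the result before going any further.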
If its value is less than zero, an error occurred, possibly due to permissions.

Looking up interface properties

The user has provided us with an interface name, maybe 'eth0'.
But there is no Linux command to send packets from eth0; we have to look up an internal number used to identify the interface, referred to as the interface index. Since the user did not supply us with a source MAC address, we will also look this up. Okay, this might be the least straightforward step, and I'm going to ask you to hit the 'I Believe' button a bit here.
Linux has an interface to many underlying system properties, called ioctl (Input-Output-Control). The inner workings of ioctl are considered by many to be dark and scary, but I will walk you through how to use it without dragging you through how it works.
Here is the code for looking up the interface index.

ifindex = buffer.ifr_ifindex;

Briefly, ioctl is capable of querying properties of network interfaces and storing them in a buffer we supply as an argument. The include file provides a giant struct, ifreq, used for specifying these queries and storing the results.
We first need to (well, ought to) zero out the contents of our struct and load it with the friendly name of the interface (memset and strncpy, respectively). The constant IFNAMSIZ defines the maximum length of friendly interface names for the operating system. We then invoke ioctl on an open socket (as far as I know, it doesn't matter what kind of open socket) with a constant specifying what property we want, and ioctl will load the struct with the corresponding result, which we store into ifindex. Likewise, we can look up the source MAC address this way.
ETH_ALEN);

I've skipped a couple of key details about how ioctl and ifreq structs work. Make sure you copy the results of each ioctl before invoking it again, as previous results may get overwritten. Like socket, ioctl returns a negative value on error; see its man page for more.

Filling in the packet fields

We're in the home stretch! Just some data structures to populate and it's time to fire this packet off! Let's go ahead and fill in the packet fields next.

memcpy(frame.field.data, data, datalen);

We'll create an instance of the union we defined earlier, but initially treat it as the internal struct so we can set fields individually without worrying about typecasting.
struct ethhdr defines the standard three-field Ethernet header, with an 'h_' preceding the field names. Since the addresses are byte arrays, we use memcpy to copy them. Again, proto must be converted to network byte order (unless you did this earlier and stored it for later use. But that would be WAY too efficient for our sensibilities, right?). As for the payload data, we'll assume that everything fits within the allocated space, but a quick check that datalen does not exceed that space would be wise.

memcpy((void*)(saddrll.sll_addr), (void*)dest, ETH_ALEN);

struct sockaddr_ll is defined in the packet socket header. It contains roughly half a dozen fields, but we only need to worry about four of them, of which two are gimmes.
Just to be safe, we start by zeroing it out. sll_family is simply a repetition that yes, this is in fact a packet socket. sll_halen confirms that Ethernet addresses are ETH_ALEN (i.e., 6) bytes long. Fields sll_ifindex and sll_addr specify the sending interface and destination address, respectively. (I'm not entirely clear why the destination address is both specified here and manually put into the Ethernet frame, but I suspect it is clear in the kernel code.) You'll notice the (void*) typecasting here; often these are not needed, but once in a while, depending on the type given to the struct fields, it gets rid of a compiler warning.
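Putting the last two steps together, here is a hedged sketch of filling in both the frame and the sockaddr_ll. The union layout follows the field names used in the text (frame.field.data, frame.buffer), but the type name ethframe, the function fill_frame and its parameters are assumptions made for illustration. It also includes the payload length check suggested above.

```c
#include <arpa/inet.h>       /* htons */
#include <linux/if_ether.h>  /* struct ethhdr, ETH_ALEN, ETH_HLEN, ETH_DATA_LEN */
#include <linux/if_packet.h> /* struct sockaddr_ll */
#include <string.h>
#include <sys/socket.h>      /* AF_PACKET */

/* Assumed layout: the same bytes viewed as header + data or as a flat buffer. */
union ethframe {
    struct {
        struct ethhdr header;
        unsigned char data[ETH_DATA_LEN];
    } field;
    unsigned char buffer[ETH_FRAME_LEN];
};

/* Hypothetical helper. Returns the frame length (header + payload),
 * or -1 if the payload does not fit. */
int fill_frame(union ethframe *frame, struct sockaddr_ll *saddrll,
               const unsigned char dest[ETH_ALEN],
               const unsigned char source[ETH_ALEN],
               unsigned short proto, int ifindex,
               const void *data, size_t data_len)
{
    if (data_len > ETH_DATA_LEN)
        return -1;

    /* Ethernet header: destination, source, EtherType (network order). */
    memcpy(frame->field.header.h_dest, dest, ETH_ALEN);
    memcpy(frame->field.header.h_source, source, ETH_ALEN);
    frame->field.header.h_proto = htons(proto);
    memcpy(frame->field.data, data, data_len);

    /* Link-layer address: zero it, then set the four fields we need. */
    memset(saddrll, 0, sizeof(*saddrll));
    saddrll->sll_family = AF_PACKET;
    saddrll->sll_ifindex = ifindex;
    saddrll->sll_halen = ETH_ALEN;
    memcpy(saddrll->sll_addr, dest, ETH_ALEN);

    return (int)(ETH_HLEN + data_len);
}
```

The returned length is what would later be passed to sendto along with frame->buffer.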
Sending the packet!

You have successfully created a raw socket, looked up properties of a network interface, filled in data structures, and you're all set to fire off your very own, customized Ethernet frame. Only one line of code stands between you and victory! So, let's have it.

(struct sockaddr*)&saddrll, sizeof(saddrll));

This is the same as a sendto for UDP or raw IP. Briefly, it takes as arguments the packet socket descriptor, the byte array containing the entire frame (headers included), length of the frame (datalen + ETH_HLEN), flags (if any), a pointer to the sockaddr_ll struct, and the size of the struct.
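For reference, the full call with the arguments in the order just listed might look like the sketch below. The wrapper name send_frame and its parameters are hypothetical; only the sendto call itself is the point.

```c
#include <linux/if_packet.h> /* struct sockaddr_ll */
#include <stdio.h>
#include <sys/socket.h>      /* sendto */
#include <sys/types.h>       /* ssize_t */

/* Hypothetical wrapper around sendto for a packet socket.
 * `frame` holds the whole frame, headers included. */
ssize_t send_frame(int s, const void *frame, size_t frame_len,
                   const struct sockaddr_ll *saddrll)
{
    ssize_t sent = sendto(s, frame, frame_len, 0 /* no flags */,
                          (const struct sockaddr *)saddrll,
                          sizeof(*saddrll));
    if (sent < 0)
        perror("sendto");
    return sent;
}
```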
It returns the number of bytes sent, or a negative value on error.

Anonymous

Mike, I have a question. Let's say we're going to prepare and send an Ethernet frame via raw socket and we only have for Ethernet payload data the following: an IPv4 header (20 octets) and UDP header with no UDP payload data (8 octets). We start our Ethernet frame with the destination and source MACs and protocol, and then add in our IPv4 header and UDP header. With that, we have 42 octets (6 + 6 + 2 + 20 + 8), which is short of the minimum of 60 (ignoring the frame check sequence field (FCS) which is 4 octets).
Should we let the kernel pad the Ethernet frame out to 60 or do it ourselves? The Linux kernel inserts the FCS and other junk anyway, but I wonder what would happen if we padded the frame out ourselves.

Hi Dave, great questions! To my knowledge, padding is always done on your behalf when using packet sockets, as is the Ethernet preamble and FCS/CRC. I have confirmed this behavior on the system described in my writeup. It is not clear to me, however, whether in general padding is done by the kernel as you suggested or by the interface hardware.
This behavior can be observed by sending the frame you describe above and capturing that frame at both the sending interface and at the remote, link-local interface to which it is destined. Padding bytes will not show up in the local capture but will in the remote capture. Your second question is more subtle. The short answer is that it doesn't functionally matter whether you or the underlying system pads out the frame; the end result will be the same. Suppose you padded your 42-byte frame (using null bytes) to 60 bytes. Then the receiving side will see a frame of length 60 with your 42 bytes followed by your padding, with the FCS appended to the end.
Now suppose you don't add padding and let the underlying system do it for you. The receiving side will see precisely the same frame as in the first case: your 42 bytes followed by padding and finally the FCS. The interesting question is how the receiving side treats this padding. Unlike with C strings, null bytes in an Ethernet frame have no special meaning such as termination. This is necessary because legitimate frames may contain null bytes throughout their payloads. Further, Ethernet frame headers do not denote frame length. Thus, to the receiver there is no difference between a 42-byte frame with padding and a 60-byte frame that happens to contain null bytes in the payload.
It is therefore typically up to the higher layers to implement PDU length fields and checks as needed. (As an aside, I learned the hard way that the receiving process must account for padding when checking the received frame length against an expected length. If the expected length is less than the minimum frame length, then the check should compare the received length against that minimum instead but use the expected length for extracting the higher layer PDU bytes.) Notice that the FCS must therefore also contain the same value whether you or the underlying system performs the padding; otherwise it would be required to check the FCS against all prefixes of the received frame that do not truncate any non-null bytes. The only reason I can think of why you might want to pad a frame yourself is if it saves the underlying system from performing an expensive operation, such as copying the frame into a larger buffer with room for padding. However, this is not likely to be the way padding is implemented since short frames occur frequently. One of these days I'll have to look at the kernel implementation of packet sockets and confirm this. If you do so or find a reference for this, please share!
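To make the padding discussion concrete, here is a small sketch of both sides: padding a short frame with null bytes at the sender, and the receive-side length check from the aside above. The names MIN_FRAME_NO_FCS, pad_frame and frame_length_ok are hypothetical, and the check assumes the frame arrived over a real Ethernet link where short frames have already been padded to 60 bytes.

```c
#include <stddef.h>
#include <string.h>

/* Minimum Ethernet frame length without the 4-byte FCS. */
#define MIN_FRAME_NO_FCS 60

/* Sender side: zero-pad a short frame in place.
 * Returns the new length, or 0 if the buffer is too small to pad. */
size_t pad_frame(unsigned char *frame, size_t len, size_t capacity)
{
    if (len >= MIN_FRAME_NO_FCS)
        return len;                    /* nothing to do */
    if (capacity < MIN_FRAME_NO_FCS)
        return 0;                      /* no room for padding */
    memset(frame + len, 0, MIN_FRAME_NO_FCS - len);
    return MIN_FRAME_NO_FCS;
}

/* Receiver side: compare against the padded length when the expected
 * frame is shorter than the minimum. The caller still uses
 * expected_len when extracting the higher-layer PDU. */
int frame_length_ok(size_t received_len, size_t expected_len)
{
    size_t on_wire = expected_len < MIN_FRAME_NO_FCS
                         ? MIN_FRAME_NO_FCS
                         : expected_len;
    return received_len == on_wire;
}
```

For the 42-byte frame from the question, pad_frame would add 18 null bytes, and a receiver expecting 42 bytes would accept a received length of 60.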
Cheers, Mike.

I should add the caveat that there is actually a second reason why one might pad a frame themselves, though I think it is specific to measurement and diagnostic applications. One might want to make explicit the exact string of bytes being emitted at the sender so as to obviate any implicit behavior of the underlying system. This would also mean that the sender and receiver could agree in advance (possibly out of band) on the exact string being transmitted. This might be useful when testing the handling of very specific frame contents, but less so in general applications.

Anonymous

Hi Mike, First I would like to thank you for the clear and valuable tutorial you made. When constructing the raw packet, you assumed that the destination address is already known.
However, if one wanted to make a complete implementation that fits all cases, one should use ARP. Actually, I want to do that but, as I am still a beginner in network programming, I don't know how to do it properly (what libraries to use, what fields to fill). I've already studied the ARP protocol, but I have some points to check and I would be pleased if you could give me some guidelines.

1/ What data structure to implement and what header files to include (as the Ethernet payload would be an ARP request)?
2/ The EtherType must be changed to the ARP EtherType.
3/ When filling the destination address field 'saddrll.sll_addr', one must put the broadcast MAC address.
4/ Use 'sendto' to send the ARP request.
5/ Add a recvfrom call to get the ARP response before proceeding to send data (so recvfrom must be a blocking call in this case).
6/ Fill the Ethernet frame with the destination MAC address we got and send data.
7/ I assume that it is not necessary to use two sockets, one each for ARP and Ethernet data (one socket suffices for both).

These are my assumptions and questions; please correct me if I am wrong. Thank you very much, Marcello.
Hi Gabriel, implementing ARP from the ground up sounds like a fun project! I'll try to address your questions here, but I also recommend referencing Wright and Stevens' TCP/IP Illustrated Volume 2, which dedicates an entire chapter to ARP implementation. The headers you'll need should be the same as in my code above, presuming you don't want extra library support for constructing ARP messages. The ethertype for ARP will be 0x0806 (remember to put in network order using htons or hard code as 0x0608 if developing for a little-endian machine). The 'proto' field in socket specifies which ethertype is received by that packet socket; you can specify htons(ETH_P_ALL) to receive all ethertypes, though you will then need to filter all other arriving Ethernet frames that you don't want. It may be easier to use two sockets to simplify this.

In the sockaddr_ll struct for an outgoing ARP request, .sll_addr is indeed the broadcast MAC address. Other than that, you should only need to change .sll_protocol to the ARP ethertype. You will still use sendto to send the request. You can receive ARP replies using recvfrom as I described in a comment above. From there, you can use the .sll_addr or parse bytes 7-12 in the buffer you pass to recvfrom to get the destination address for your frame. Again, you can either use a single socket and filter all unintended frames you receive or use two sockets.

One last consideration is how to de-conflict with the existing ARP implementation on your system. Since you're not trying to process incoming ARP requests, there won't be any conflicting or duplicate responses between your implementation and the system's. However, you may receive replies to ARP requests sent by the system, in which case you'll need to parse the full ARP reply to determine whether it answers your request.
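As for the ARP message itself, which travels as the Ethernet payload, here is a minimal sketch of one possible data structure, assuming ARP for IPv4 over Ethernet with the field layout from RFC 826. The struct name arp_ipv4, the constants and both helper functions are hypothetical, not part of my code above or of any system header.

```c
#include <arpa/inet.h> /* htons */
#include <stdint.h>
#include <string.h>

#define ARP_HTYPE_ETHERNET 1      /* hardware type: Ethernet */
#define ARP_PTYPE_IPV4     0x0800 /* protocol type: IPv4 */
#define ARP_OP_REQUEST     1
#define ARP_OP_REPLY       2

/* Assumed layout: the nine ARP fields for Ethernet/IPv4, 28 bytes. */
struct arp_ipv4 {
    uint16_t htype;   /* hardware type */
    uint16_t ptype;   /* protocol type */
    uint8_t  hlen;    /* hardware address length (6) */
    uint8_t  plen;    /* protocol address length (4) */
    uint16_t oper;    /* operation: request or reply */
    uint8_t  sha[6];  /* sender MAC */
    uint8_t  spa[4];  /* sender IP */
    uint8_t  tha[6];  /* target MAC (unknown in a request) */
    uint8_t  tpa[4];  /* target IP */
} __attribute__((packed));

/* Fill in a request asking who owns target_ip. */
void build_arp_request(struct arp_ipv4 *arp, const uint8_t src_mac[6],
                       const uint8_t src_ip[4], const uint8_t target_ip[4])
{
    arp->htype = htons(ARP_HTYPE_ETHERNET);
    arp->ptype = htons(ARP_PTYPE_IPV4);
    arp->hlen = 6;
    arp->plen = 4;
    arp->oper = htons(ARP_OP_REQUEST);
    memcpy(arp->sha, src_mac, 6);
    memcpy(arp->spa, src_ip, 4);
    memset(arp->tha, 0, 6);
    memcpy(arp->tpa, target_ip, 4);
}

/* True if `arp` is a reply whose sender IP is the one we asked about;
 * in that case arp->sha holds the MAC address we wanted. */
int arp_reply_answers(const struct arp_ipv4 *arp, const uint8_t target_ip[4])
{
    return arp->oper == htons(ARP_OP_REPLY) &&
           memcmp(arp->spa, target_ip, 4) == 0;
}
```

The 28-byte struct would be copied into the frame's data area, with the EtherType set to 0x0806 and the destination set to the broadcast address as described above.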
Hope this helps, and I'd be excited to see what you develop!

Anonymous

Hi Mike, Thank you for your reply. I think that there are other aspects that should be taken into consideration beyond updating the Ethernet header.
In order to implement an ARP request, the ARP header should be included in the Ethernet payload field before sending the frame. ARP contains 9 fields (hardware type, protocol type, operation, destination MAC, destination IP, etc.) and these should be specified.
Actually, that's why I was asking about what appropriate data structure to use in order to implement that. In truth, I want to make an embedded system able to communicate with a host at the raw Ethernet level. That's why, before starting to exchange data with the host, the embedded system must know what MAC address the host is actually using. Again, thank you so much and regards, Marcello.

Anonymous

Hi Mike, Thank you, a clear explanation.
"how the device knows the IP addresses it needs to resolve using ARP"

To solve this I imposed a specific IP address to be used on the host side. So the user has to configure the host with the IP address I specified in order to be able to communicate with the device.

"how to process incoming ARP requests (generally necessary for other devices sending to you)"

That should be taken into consideration :) Also, as the system has to deal with only one host and the host MAC address is unlikely to change, I suppose there is no need to implement an ARP cache. A character array of 6 bytes would be sufficient to hold the MAC address of the host. Don't worry about mixing names :) Thanks and regards, Marcello.
Hi Lucio, You're absolutely correct. Unlike with higher-layer sockets, the buffer filled in by recvfrom in this case is the full Ethernet frame, minus preamble and CRC (which I believe are removed prior to kernel processing).
Supposing the char pointer you use in recvfrom is s, the quick and dirty solution for printing would be something like: printf("%s", s+14); since the first 14 bytes are the Ethernet header. Assuming you aren't using any headers past that point, the remainder of the buffer is your payload. If you wanted to be more general with link-layer headers, try experimenting with the sll_hatype field of the sockaddr_ll that gets populated by recvfrom. I suspect you can map each value (see linux/if_arp.h in your headers) to a header length.

Hi Shreyas, thanks for the comments! Answering your first question: my code uses a C union (between the embedded struct containing the Ethernet header and payload data, and a char array containing the entire 'buffer'). I set the bytes of the Ethernet frame by setting frame.header and frame.data, and then refer to those same bytes (but as a different data type) using frame.buffer.
An alternative would be to not use a union, and instead define a struct that is recast as a char array when sending. Your code appears to work fine (assuming you #include the same files as in the sender, and that you run the code as root). I modified the sender code to set iface to 'lo' (the loopback interface) instead of 'eth0' and set dest (the destination Ethernet address) to all 0xff's, and was able to receive the 'hello world' message sent by my program using your code above. By the way, I also edited the printf call to show the contents of the payload (assuming it contains a human-readable string): printf("\nData rcvd (%d):%s\n", datarcvd, framebuffer+sizeof(struct ethhdr)); Cheers!
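One caution about the printf calls above: the bytes that recvfrom writes are not NUL-terminated, so "%s" can run past the end of the received data. A slightly more defensive sketch, with the hypothetical helpers frame_payload and print_payload, bounds the output by the received length instead.

```c
#include <linux/if_ether.h> /* struct ethhdr */
#include <stdio.h>
#include <sys/types.h>      /* ssize_t */

/* Hypothetical helper: locate the payload inside a received frame.
 * `frame_len` is the value returned by recvfrom. Returns NULL if the
 * frame is too short to hold an Ethernet header (or recvfrom failed). */
const unsigned char *frame_payload(const unsigned char *frame,
                                   ssize_t frame_len, size_t *payload_len)
{
    if (frame_len < (ssize_t)sizeof(struct ethhdr))
        return NULL;
    *payload_len = (size_t)frame_len - sizeof(struct ethhdr);
    return frame + sizeof(struct ethhdr);
}

/* Print the payload as text, never reading beyond the received bytes. */
void print_payload(const unsigned char *frame, ssize_t frame_len)
{
    size_t n;
    const unsigned char *p = frame_payload(frame, frame_len, &n);

    if (p != NULL)
        printf("\nData rcvd (%zu): %.*s\n", n, (int)n, (const char *)p);
}
```

With "%.*s", printing stops at the first null byte or after n bytes, whichever comes first, so trailing padding in a short frame is not printed.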
I too witnessed “packet not going out” which had me going for some time. I was using an Asus laptop with its built-in NIC set up with six VLANs. All the VLANs were basically working in that I could ping devices on all different VLANs, and then look at the ARP table and see the expected entries against each sub-interface, but packets generated using this code were not going out, at least not according to tcpdump.
The same software was working on another machine in the same version of Fedora, so I was pretty sure the software was correct. In the end, I added a second NIC (elderly 100Mbit/s Belkin, USB-attached), reconfigured my interfaces to run over that NIC and everything started working correctly. My tentative conclusion is that the drivers or the hardware of the built-in NIC were not handling RAW sockets correctly over sub-interfaces. The suspect NIC reports as “Tigon3 partno(BCM95764m) rev 5784100” in a dmesg report, and its MAC is interpreted as Wistron by Wireshark. Kernel version is “2.6.35.14-106”.