[ Devops : Back to Basics ] : The Data Link Layer
This article is part of a series called [ Devops : Back to Basics ] where I document what I'm learning on my journey to become a DevOps engineer from my current position as a backend engineer.
Ethernet and MAC Addresses
The protocol most widely used to send data across individual links is known as Ethernet.
To deal with collisions inside a collision domain, Ethernet uses CSMA/CD (carrier sense multiple access with collision detection). CSMA/CD is used to determine when the communication channel is clear and when a device is free to transmit data.
The way CSMA/CD works is actually pretty simple. If there's no data currently being transmitted on the network segment, a node will feel free to send data. If it turns out that two or more computers end up trying to send data at the same time, the computers detect this collision and stop sending data.
Each device involved with the collision then waits a random interval of time before trying to send data again. This random interval helps to prevent all the computers involved in the collision from colliding again the next time they try to transmit anything.
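To make the "random interval" idea concrete, here is a minimal sketch of how a node could pick its waiting time. It assumes the truncated binary exponential backoff scheme used by classic Ethernet, where the range of possible waits doubles after each collision in a row. The function name `backoff_slots` and the constant `SLOT_TIME_BITS` are made up for this example.

```python
import random

# Assumption: one slot lasts 512 bit times, as on classic 10/100 Mb/s Ethernet.
SLOT_TIME_BITS = 512


def backoff_slots(collisions: int, rng: random.Random) -> int:
    """Pick how many slot times to wait after the n-th collision in a row."""
    # The range of choices doubles with each collision, capped at 2**10 slots.
    exponent = min(collisions, 10)
    return rng.randrange(2 ** exponent)


rng = random.Random(42)  # seeded only so that the example is repeatable
for collisions in range(1, 5):
    slots = backoff_slots(collisions, rng)
    print(f"collision {collisions}: wait {slots} slot(s), {slots * SLOT_TIME_BITS} bit times")
```

Because every node draws its own random number, two nodes that just collided are unlikely to pick the same wait, and the odds get better as the range grows.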
When a network segment is a collision domain, it means that all devices on that segment receive all communication across the entire segment. This means we need a way to identify which node the transmission was actually meant for.
This is where something known as a MAC address (media access control address) is used.
A MAC address is a globally unique identifier attached to an individual network interface. It's a 48-bit number normally represented by six groupings of two hexadecimal digits, for example 00:1A:2B:3C:4D:5E.
A MAC address is split into two sections. The first three octets of a MAC address are known as the organizationally unique identifier or OUI. These are assigned to individual hardware manufacturers by the IEEE (Institute of Electrical and Electronics Engineers).
The last three octets of a MAC address can be assigned in any way that the manufacturer would like, with the condition that they only assign each possible address once to keep all MAC addresses globally unique.
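The split into two halves can be sketched in a few lines of Python. This is only an illustration: `split_mac` is a hypothetical helper, and the address used below is made up rather than taken from a real device.

```python
import re

# Six groups of two hex digits, separated by ":" or "-".
MAC_PATTERN = re.compile(r"[0-9A-Fa-f]{2}([:-][0-9A-Fa-f]{2}){5}")


def split_mac(mac: str) -> tuple[str, str]:
    """Return (OUI, manufacturer-assigned part) for a textual MAC address."""
    if not MAC_PATTERN.fullmatch(mac):
        raise ValueError(f"not a valid MAC address: {mac!r}")
    octets = mac.upper().replace("-", ":").split(":")
    # First three octets: OUI. Last three octets: chosen by the manufacturer.
    return ":".join(octets[:3]), ":".join(octets[3:])


oui, device_part = split_mac("00:1a:2b:3c:4d:5e")
print("OUI:", oui)
print("Manufacturer-assigned part:", device_part)
```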
Ethernet uses MAC addresses to ensure that the data it sends carries both the address of the machine that sent the transmission and the address of the machine it was intended for. In this way, even on a network segment acting as a single collision domain, each node on that network knows when traffic is intended for it.
Unicast, Multicast, and Broadcast
When one device transmits data to exactly one other device, it is called unicast.
A unicast transmission is always meant for just one receiving address.
At the Ethernet level, this is done by looking at a special bit in the destination MAC address. If the least significant bit in the first octet of a destination address is set to zero, it means that Ethernet frame is intended for only the destination address.
This means it would be sent to all devices on the collision domain, but only actually received and processed by the intended destination.
If the least significant bit in the first octet of a destination address is set to one, it means you're dealing with a multicast frame.
A multicast frame is similarly sent to all devices on the local network segment.
What's different is that it will be accepted or discarded by each device depending on criteria aside from their own hardware MAC address.
The third type of Ethernet transmission is known as broadcast.
An Ethernet broadcast is sent to every single device on a LAN. This is accomplished by using a special destination known as a broadcast address.
The Ethernet broadcast address is all Fs, written FF:FF:FF:FF:FF:FF.
Ethernet broadcasts are used so that devices can learn more about each other.
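The three cases can be told apart just by looking at the destination address. Here is a small sketch of that check. The function name `classify_destination` is an assumption for this example, and the sample addresses are only illustrations.

```python
def classify_destination(mac: str) -> str:
    """Classify a destination MAC address as unicast, multicast or broadcast."""
    raw = bytes.fromhex(mac.replace(":", "").replace("-", ""))
    if len(raw) != 6:
        raise ValueError(f"expected 6 octets, got {len(raw)}")
    # Broadcast is the special "all Fs" address.
    if raw == b"\xff" * 6:
        return "broadcast"
    # Least significant bit of the first octet: 1 means multicast, 0 means unicast.
    if raw[0] & 0x01:
        return "multicast"
    return "unicast"


for address in ("00:1a:2b:3c:4d:5e", "01:00:5e:00:00:fb", "ff:ff:ff:ff:ff:ff"):
    print(address, "->", classify_destination(address))
```

Note that the broadcast address also has its least significant bit set to one, which is why the broadcast check comes first.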
A data packet is an all-encompassing term that represents any single set of binary data being sent across a network link.
The term data packet isn't tied to any specific layer or technology. It just represents a concept. One set of data being sent from point A to Point B.
Data packets at the Ethernet level are known as Ethernet frames.
An Ethernet frame is a highly structured collection of information presented in a specific order.
Almost all sections of an Ethernet frame are mandatory and most of them have a fixed size.
The first part of an Ethernet frame is the preamble. It is 8 bytes or 64 bits long and can itself be split into two sections. The first seven bytes are a series of alternating ones and zeros. These act partially as a buffer between frames and can also be used by the network interfaces to synchronize the internal clocks they use to regulate the speed at which they send data.
The last byte in the preamble is known as the SFD or start frame delimiter. This signals to a receiving device that the preamble is over and that the actual frame contents will now follow.
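As a rough illustration, the preamble and SFD can be written as strings of bits in the order they appear on the wire, and a receiver can look for that pattern to find where the frame contents begin. This is a simplification: real hardware works on electrical signals rather than Python strings, and `find_frame_start` is a hypothetical helper.

```python
PREAMBLE_BITS = "10101010" * 7  # seven bytes of alternating ones and zeros
SFD_BITS = "10101011"           # start frame delimiter: breaks the pattern with two ones


def find_frame_start(bits: str) -> int:
    """Return the index of the first bit after the SFD, or -1 if none is found."""
    marker = PREAMBLE_BITS + SFD_BITS
    position = bits.find(marker)
    return -1 if position == -1 else position + len(marker)


# Some idle bits, then preamble + SFD, then the first bits of the frame itself.
stream = "0000" + PREAMBLE_BITS + SFD_BITS + "1100"
start = find_frame_start(stream)
print("frame contents start at bit", start, "->", stream[start:])
```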
Immediately following the start frame delimiter, comes the destination MAC address. This is the hardware address of the intended recipient.
It is followed by the source MAC address, which identifies where the frame originated.
NB: each MAC address is 48 bits or 6 bytes long.
Length / EtherType
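The next part of an Ethernet frame is called the EtherType field. It's 16 bits long and used to describe the protocol of the contents of the frame. The same field can instead carry the length of the payload: values of 1500 or less are read as a length, while values of 1536 (0x0600) or more are read as an EtherType.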
A payload in networking terms is the actual data being transported, which is everything that isn't a header. The data payload of a traditional Ethernet frame can be anywhere from 46 to 1500 bytes long. This contains all of the data from higher layers such as the IP, transport and application layers that's actually being transmitted.
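Putting the fields described so far together, here is a minimal sketch of how a frame could be split into destination, source, EtherType and payload. It assumes the bytes handed to it start right after the SFD and do not include the trailing frame check sequence. `parse_frame` and the sample frame are made up for this example.

```python
import struct


def parse_frame(frame: bytes) -> dict:
    """Split an Ethernet frame (no preamble/SFD, no FCS) into its fields."""
    if len(frame) < 14:
        raise ValueError("an Ethernet header is 14 bytes long")
    # "!" = network byte order, 6s = 6 raw bytes, H = unsigned 16-bit integer.
    destination, source, ethertype = struct.unpack("!6s6sH", frame[:14])
    return {
        "destination": destination.hex(":"),
        "source": source.hex(":"),
        "ethertype": ethertype,
        "payload": frame[14:],
    }


# Hypothetical frame: broadcast destination, made-up source, EtherType 0x0800 (IPv4),
# followed by the minimum payload of 46 bytes (all zeros here).
sample = bytes.fromhex("ffffffffffff" "001a2b3c4d5e" "0800") + bytes(46)
fields = parse_frame(sample)
print(fields["destination"], fields["source"], hex(fields["ethertype"]), len(fields["payload"]))
```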
Following that data we have what's known as a frame check sequence. This is a 4-byte or 32-bit number that represents a checksum value for the entire frame. This checksum value is calculated by performing what's known as a cyclic redundancy check against the frame. A cyclic redundancy check, or CRC, is an important concept for data integrity and is used all over computing, not just in network transmissions. A CRC is basically a mathematical transformation that uses polynomial division to create a number that represents a larger set of data. Anytime you perform a CRC against the same set of data, you should end up with the same checksum number.
The reason it's included in the Ethernet frame is so that the receiving network interface can check whether it received uncorrupted data. When a device gets ready to send an Ethernet frame, it collects all the information, like the destination and originating MAC addresses, the data payload and so on. Then it performs a CRC against that data and attaches the resulting checksum number as the frame check sequence at the end of the frame.
This data is then sent across a link and received at the other end.
Here, all the various fields of the Ethernet frame are collected and now the receiving side performs a CRC against that data.
If the checksum computed by the receiving end doesn't match the checksum in the frame check sequence field, the data is thrown out. This is because some amount of data must have been lost or corrupted during transmission.
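The send-and-verify round trip can be sketched with the CRC-32 function from Python's standard library, which uses the same polynomial as Ethernet. Treat this as an illustration rather than a faithful model of a network card: the helper names are made up, and the byte order used to store the checksum here (least significant byte first) is an assumption of the sketch.

```python
import zlib


def append_fcs(frame: bytes) -> bytes:
    """Sender side: compute the CRC over the frame and attach it at the end."""
    fcs = zlib.crc32(frame).to_bytes(4, "little")
    return frame + fcs


def fcs_is_valid(frame_with_fcs: bytes) -> bool:
    """Receiver side: recompute the CRC and compare it with the received one."""
    body, received_fcs = frame_with_fcs[:-4], frame_with_fcs[-4:]
    return zlib.crc32(body).to_bytes(4, "little") == received_fcs


sent = append_fcs(b"some frame contents" * 4)
print("intact frame accepted:", fcs_is_valid(sent))

# Flip a single bit in the first byte to simulate corruption on the link.
corrupted = bytes([sent[0] ^ 0x01]) + sent[1:]
print("corrupted frame accepted:", fcs_is_valid(corrupted))
```

A receiver that sees a mismatch simply drops the frame. Ethernet itself does not ask for a retransmission, so recovering the lost data is left to higher layers.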
That's it!
Don't hesitate to comment to point out mistakes or add clarifications.
As I said in the beginning, I am still studying these concepts :)
See you in the next one! 🤖🤖🤖