Networking Concepts Overview

What is a network?  Ans: Computers (a.k.a. hosts) able to share information and resources.  A network may be small (one geographical location) or may cover the globe.  Different technologies are used in each case.  For a LAN we use technology such as Ethernet, which broadcasts packets so every station sees them.  For a WAN we use slower, error-prone serial links that connect one LAN to another (point-to-point connections).

Qu: What is needed physically?   Media such as a cable between each pair of computers (called a mesh), or in a ring, or in a bus, or in a star.  (Network topologies.)  (Show bus/star with 4 hosts.)

Computer cables can be tricky to work with.  They don’t work if too long, and they work poorly if kinked, if not terminated correctly, or if improperly grounded.  When attaching a connector to the end of a cable, if you straighten out 0.5in at the end more than you should, the 100Mb/sec cable will only support about 30Mb/sec!

Besides being more delicate than commonly supposed, there are safety issues with standard cables.  Such cables are clad in PVC, a strong, durable, flexible, and cheap insulator.  However, in a fire the cables can get hot and then they give off toxic, chlorine-bearing fumes (mostly hydrogen chloride gas)!  In air spaces where people might be, you need to use more expensive (but safer) plenum cable.  There are various building and safety codes to consider as well.  In the end, you should consider using a licensed cable installer.

Qu: What would happen if two or more hosts transmit at the same time?  Ans: a collision.  To permit communications all parties must agree to a set of rules, or protocols.  Today the most common set of protocols for a LAN is called Ethernet.

In addition to media and protocols, you need some hardware to connect the host to the media.  This is called a NIC (Network Interface Card).  In Linux these have names such as “eth#”, where “#”=0,1,2,... (or, on newer systems, names such as “enp0s3”).  In Solaris, they have strange names such as “elx#” depending on the manufacturer/chipset of the NIC, but many modern NICs are simply known as “hme#”.

Your operating system needs the correct driver software to allow applications to send messages to the NIC.  Then some application can send a message to another host by invoking the proper API function.  The data will be passed to the NIC and then sent on its way.

Qu: (point to diagram) if this host wants to send a message to that host, how does it do that?  Ans: each computer needs a unique address, so when one computer sends data to another, the intended recipient knows the data was meant for it.  The other computers on the network are supposed (!) to ignore the message.  In the old days, the administrator manually set each NIC with a unique number between 1 and 255.  Today, NICs come configured with an address already, known as the MAC Address (or BIA, data-link address, ...).  For Ethernet NICs, this address is 6 bytes (48 bits).  The first three bytes uniquely identify the manufacturer, assigned by the IEEE.  The last three bytes are a unique serial number.
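
The two halves of a MAC address can be pulled apart with a few lines of Python.  This is only a sketch: the function name split_mac is made up for illustration, and the sample address is the HWADDR value used in the configuration examples later in these notes.

```python
def split_mac(mac: str) -> tuple[str, str]:
    """Split a 48-bit MAC address into (manufacturer prefix, serial number)."""
    octets = mac.replace("-", ":").lower().split(":")
    if len(octets) != 6 or any(len(o) != 2 for o in octets):
        raise ValueError(f"not a 48-bit MAC address: {mac!r}")
    for o in octets:
        int(o, 16)  # raises ValueError if an octet is not hexadecimal
    # First three bytes: assigned to the manufacturer by the IEEE.
    # Last three bytes: serial number chosen by the manufacturer.
    return ":".join(octets[:3]), ":".join(octets[3:])


oui, serial = split_mac("00:06:5B:3D:43:0F")
print(oui, serial)
```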

For host A to send data to host B, host A must build a packet containing the data plus a packet header which contains the destination MAC address and other information (e.g., packet length, type of data, ...).  Besides the header and data, a checksum (FCS or frame check sequence) is appended to the end.  This header and checksum are sometimes called framing, and these packets are sometimes called frames.

Overview of communication:  Application invokes API function, passing it the address of the recipient and the data to send.  API function builds the packet from this info and sends the packet to the NIC.  The NIC sends out the packet onto the media, one bit at a time, according to the network protocol.  If the data is very large, the API function will split it into several packets, which get reassembled at the destination.  (Example: FTP a large file; Ethernet max packet size is 1518 bytes, or 1522 bytes with an 802.1Q VLAN tag, including 18 or 22 bytes of header and checksum, which leaves 1500 bytes for data.)
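
The splitting and reassembly step might look roughly like the following.  This is a simplified sketch, not how any real protocol stack is written: the function names and the use of a bare sequence number are assumptions for illustration, and 1500 bytes is used as the usual Ethernet payload limit.

```python
MAX_PAYLOAD = 1500  # bytes of data carried per packet (assumed payload limit)


def split_into_packets(data: bytes, max_payload: int = MAX_PAYLOAD):
    """Return a list of (sequence number, chunk) pairs covering all of data."""
    return [
        (seq, data[offset:offset + max_payload])
        for seq, offset in enumerate(range(0, len(data), max_payload))
    ]


def reassemble(packets) -> bytes:
    """Rebuild the original data, even if the packets arrive out of order."""
    return b"".join(chunk for _, chunk in sorted(packets))


message = bytes(4000)                      # 4000 bytes of dummy data
packets = split_into_packets(message)
sizes = [len(chunk) for _, chunk in packets]
print(len(packets), sizes)                 # 3 [1500, 1500, 1000]
print(reassemble(reversed(packets)) == message)
```

The sequence number is what lets the receiver put the pieces back in order; real protocols carry the equivalent information in their headers.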

Most NICs will examine only enough of the packet to see if it was intended for them or not.  If not, they stop looking at it.  The intended destination NIC will read in the whole packet, compute a checksum, and compare it with the checksum at the end of the packet.  If they don’t match, the packet is corrupted and is discarded; it must then be sent again (Ethernet itself does not resend it, so this is left to higher-level software).
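
The framing and checksum idea can be sketched as follows.  This is not the real Ethernet frame layout (the field order is simplified, and real NICs compute the FCS in hardware with their own bit ordering); it only shows the header + data + checksum pattern, with made-up function names and a made-up source address.

```python
import struct
import zlib


def build_frame(dst: bytes, src: bytes, payload: bytes) -> bytes:
    """Simplified frame: destination, source, payload length, data, checksum."""
    header = dst + src + struct.pack("!H", len(payload))
    body = header + payload
    return body + struct.pack("!I", zlib.crc32(body))  # 4-byte checksum at the end


def check_frame(frame: bytes) -> bool:
    """Recompute the checksum over everything but the last 4 bytes and compare."""
    body, fcs = frame[:-4], frame[-4:]
    return struct.pack("!I", zlib.crc32(body)) == fcs


dst = bytes.fromhex("00065b3d430f")   # destination MAC address
src = bytes.fromhex("020000000001")   # hypothetical source MAC address
frame = build_frame(dst, src, b"hello")
corrupted = frame[:14] + b"j" + frame[15:]   # damage the first data byte

print(check_frame(frame))       # True
print(check_frame(corrupted))   # False
```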

Internetworks

With media, NICs, protocols, addresses, and software, a network can function.  But there are problems: What if one host on your network wants to send a message (a packet) to a host on a different network?  Sending data between networks is called internetworking, and a collection of networks that are connected is called an internet.  (The “Internet”, with a capital “I”, refers to the global internet that connects nearly every network with every other.  Lately, the popular press has stopped capitalizing it for some reason.)

The obvious solution of connecting both networks into one large network doesn’t work.  The technology for LANs has strict size (number of hosts) and distance (meters not kilometers) limitations.

Instead a more complicated solution is used.  A device called a router (sometimes called a gateway) with two or more NICs is used to connect the LANs.

The sending host on the first LAN sends the packet to the router (to the NIC connected to its network).  The router then resends the packet out a different NIC that connects to a second LAN.  Now all hosts on that LAN, including the destination host, see the packet.

For this to work, every host must know the address of the router interface connected to its network.  And when the packet is received by the router, it must somehow determine through which NIC to send the packet out.  The router then needs a list of all addresses on all networks.  The situation is made worse since not all networks connect to the same router.  It is often necessary for the first router to forward the packet to another, and then another, etc., until the packet reaches the final network.

All hosts need to know the IP (internet protocol) address of the router.  This can be set with the old route command or the newer ip command on Linux.  This IP address can be stored in a file and used with the route command automatically when the network is brought up.  With Fedora, use the file /etc/default-route, and in Solaris use /etc/defaultrouter.  (This information is often stored in different files, even on Solaris and Linux!)

Actually, there are other ways for a host to send packets to a router, that don’t require knowing its address.  They are not commonly used however.

The most common internet protocol suite in use today for this is IPv4.  A newer version called IPv6 is available, as are competitors such as IPX.  The common name for the IP suite is TCP/IP.

Addressing

For one host to send a packet to another, it must know the address of the destination host.  This leads to a problem of finding out the addresses of all the hosts in the world.  Why not use MAC (Ethernet) addresses?  Huge unmanageable routing tables, administration problems (e.g., firewall configuration).

The most popular answer (there are others) is to assign an IP address to each NIC. IP addresses have two parts: network number and a host number.  The idea is that routers only need to keep track of the various networks in the world, not all the host numbers.  So, for one host to send a packet to another, it must have the IP address of the destination host.  If the network number is the same for both hosts, the packet is just broadcast on the LAN as normal.  If the destination is in a different network than the source, the sending host sends the packet to a router instead, trusting the router to forward the packet on its way.  Once a packet is delivered to the correct network, Ethernet (MAC) addresses are used to deliver the packet to the correct host, as discussed previously.

IPv4 addresses are 32 bits long.  (This isn’t a lot of addresses!  This changed to 128 bits for IPv6; see RFC-3513.)  They are most commonly written in dotted-decimal notation:  10.3.200.42.  The 32 bits are divided into two parts: the network number and the host number.  Each LAN must have a unique network number, assigned by your local ISP, who bought a block of them from a regional provider (ARIN), who in turn gets huge blocks of numbers from the IANA.  ISPs lease them out to you and me.
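
Python's standard ipaddress module can show the two parts of an address.  In this sketch the /24 prefix (24 bits of network number, 8 bits of host number) is an assumption chosen for illustration; the notes above only give the address 10.3.200.42.

```python
import ipaddress

# Assumed prefix length of 24: the first 24 bits are the network number.
iface = ipaddress.ip_interface("10.3.200.42/24")
network = iface.network

host_bits = 32 - network.prefixlen                      # bits left for the host number
host_number = int(iface.ip) & ((1 << host_bits) - 1)    # mask off the network bits

print(network.network_address)   # 10.3.200.0
print(host_number)               # 42
print(network.num_addresses)     # 256
```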

A single host may have several NICs (so do routers).  It is important to remember that it isn’t the host that has an address, it is the NIC.  So a host with two NICs has two addresses.  Also, a single NIC may have multiple addresses.  This is known as IP aliasing.  (e.g., eth0:0, eth0:1, ...)

The LANs still use Ethernet to send packets locally.  But hosts on a network only know the MAC address of NICs on that network.  So how does the sending host look up the MAC address of some other host, given only its IP address?  One way is to keep a file of IP to MAC addresses on each host, and to update it regularly.  For every host in the world.

A better way is to use ARP (address resolution protocol).  The ARP protocol is used to map IPv4 addresses to MAC addresses, so sending hosts do not need to know the MAC address, only the destination IP address.

Illustrate this protocol:  (1) source (local) host determines if destination (remote) host is on same network.  If so, then (2) broadcast ARP request for destination host MAC address.  If the destination host is on a different network, then broadcast an ARP request for the MAC address of the gateway (a router that connects one network to another).  (3) wait for ARP reply.  (4) now send packet to destination (or gateway).
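
Step (1), deciding whose MAC address to ask for, can be sketched in Python.  The function name next_hop is made up, and the addresses are borrowed from the static configuration example later in these notes; a real kernel makes this decision from its routing table.

```python
import ipaddress


def next_hop(local_if: str, gateway: str, destination: str) -> str:
    """Return the IP address whose MAC address the sender must ARP for."""
    iface = ipaddress.ip_interface(local_if)
    dest = ipaddress.ip_address(destination)
    if dest in iface.network:
        return str(dest)      # same network: ARP for the destination itself
    return gateway            # different network: ARP for the gateway instead


# Local NIC is 192.168.0.7/24 and the default gateway is 192.168.0.4.
print(next_hop("192.168.0.7/24", "192.168.0.4", "192.168.0.99"))  # 192.168.0.99
print(next_hop("192.168.0.7/24", "192.168.0.4", "10.3.200.42"))   # 192.168.0.4
```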

Hosts maintain an ARP cache to save a lookup.  To view the ARP cache, use the command arp -an.

A (possibly) useful analogy: Suppose the teacher wants to ask a student a question.  If they are in the same room (same LAN, a.k.a. same data link), the teacher just waits for a quiet moment in the room, and says “you there, in row 2 seat 4, ...”.  This is what Ethernet does.  But this won’t work when the teacher is in a different room at the time.

Suppose the teacher is in their office (DTEC-404) and wants to ask the student “Hymie” in classroom DTEC-461 a question.  In that case, the teacher asks a lab tech (a “router”) to forward a message to a student “Hymie” in DTEC-461.  The lab tech doesn’t know which desk Hymie is at (Hymie’s MAC address), but it doesn’t matter; the lab tech simply goes to DTEC-461 and, in a quiet moment, shouts “Where is Hymie?”.  Hymie replies, “I’m in row 2 seat 4”.  The lab tech can now deliver the message to that student.  Delivering messages between LANs (or classrooms) is what TCP/IP does.  Analogy aside, when your computer sends a message to another, it must decide if the other computer is on the same network or not; if so just broadcast the message on the LAN; if not, send the message to a router (the “default gateway”) for delivery to another LAN.

RARP (which is related to BOOTP and DHCP protocols) does the reverse:  Given a host’s MAC address (which is all the host typically knows when it boots up) it asks a server for its IP address.  This is useful when you don’t wish to configure each and every host in your organization individually.  You can do this of course, by putting the host’s IP address, the gateway router IP address, and other information in configuration files the host can use at boot time.  But it is easier to have a single DHCP server on the LAN.  When the host boots up it broadcasts a DHCP request packet containing its MAC address.  The DHCP reply contains all the required network parameters.

Note your own computer has a virtual NIC with the address 127.0.0.1 (the loopback address, usually referred to by the name localhost).

Port numbers and Sockets

Sending a packet to a host isn’t enough.  When the destination host gets the packet, what program should it send it to?  (Web server?  Email server?  Telnet?)  Part of the layer 4 header includes a port number to identify which program should receive the packet and which one sent the packet.  These are 16-bit values.  (Example: a web browser with two windows open.  You click a link on one, switch to the other and click a different link.  Each browser window’s HTTP request packet will use a different source port number so the replies will be sent to the correct window.)

When a host receives a packet, the kernel will check the port number to see to which process to send it.

Show packets with Wireshark.

So how does a client (say a web browser) know which port number corresponds to a server?  The servers listen for a particular port number that all agree on (IANA).  The standard servers use well known Port numbers in the range 0–1023.  Which service (and its application level protocol) uses which port number is documented in the /etc/services file.  These Ports are reserved for public services such as FTP (20 and 21), telnet (23), SMTP (25), and HTTP (80), HTTPS (443).  This makes it easy for clients; to contact your web server the client will send the request packet to your IP address and destination port 80. Note that on a Unix system root-privileges are needed to listen in on a well-known Port.  (This prevents a user from crashing your web server and then starting their own, fooling people who visit your web site!) 
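
The /etc/services format is simple enough to read with a short script.  The sketch below parses a small sample written in that format (the sample text is typed in here for illustration, not copied from any particular system), using only the port numbers listed above.  On a real system, Python's socket.getservbyname("http", "tcp") performs the same lookup.

```python
SAMPLE = """\
# service-name  port/protocol  [aliases ...]
ftp-data        20/tcp
ftp             21/tcp
telnet          23/tcp
smtp            25/tcp          mail
http            80/tcp          www
https           443/tcp
"""


def parse_services(text: str) -> dict:
    """Map (service name or alias, protocol) to a port number."""
    table = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()   # drop comments and blank lines
        if not line:
            continue
        name, port_proto, *aliases = line.split()
        port, proto = port_proto.split("/")
        for n in (name, *aliases):
            table[(n, proto)] = int(port)
    return table


services = parse_services(SAMPLE)
print(services[("http", "tcp")], services[("mail", "tcp")])   # 80 25
```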

The range 1024-49151 are User (Registered) Ports, used for other public services (such as NFS on 2049 or MySQL on 3306).  These are also registered by IANA (as a public service).

The Dynamic and/or Private Ports are those from 49152 through 65535.  In practice, clients will use any available port number higher than 1023 (the exact range is set by the operating system); the kernel keeps track of which are in use.  (Note: you can use a telnet application to connect to any port: debugging.)

A socket is the combination of an IP address and a port number.  A pair of sockets will uniquely identify a network connection from a client application on one host to a server on another host.
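
A small experiment on the loopback interface shows the socket pairs.  This is a sketch assuming a machine where 127.0.0.1 works normally; the port numbers are whatever the kernel happens to hand out, so they will differ from run to run.

```python
import socket

# A server socket on the loopback address; port 0 asks the kernel to pick a port.
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))
server.listen()
server_addr = server.getsockname()

# Two clients connect to the same server socket.
clients = [socket.create_connection(server_addr) for _ in range(2)]
accepted = [server.accept()[0] for _ in clients]

# Each connection is identified by the pair (client socket, server socket).
pairs = [(c.getsockname(), c.getpeername()) for c in clients]
for local, remote in pairs:
    print(f"client {local} <-> server {remote}")

for s in clients + accepted + [server]:
    s.close()
```

Both connections share the same server IP address and port; only the client port differs, which is enough to tell the two connections apart.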

Many servers are not started at boot time (ftp) although some are (httpd).  (Q: Why?).  Instead a “super-server” known as inetd or xinetd (or systemd on modern Linux systems) is started at boot time that listens for incoming packets with a variety of port numbers.  Inetd (or whatever) then checks its configuration file to determine which service daemon should get that packet, starts the server, and hands off the packet to it.  Such network servers are often referred to as network daemons.  Most spawn child processes for each incoming request.  This important service is configured either by editing a file /etc/inetd.conf, editing files in a directory /etc/xinetd.d, or enabling and then starting a systemd socket.

Analogy:  Suppose you live in a fancy apartment (or condo) building with a doorman, and suppose you are expecting a package to be delivered.  You can either wait by the door for the package yourself, or ask the doorman to accept the package when it arrives and deliver it to you.  You then take a nap while waiting.

Having many people wait for packages at once is inefficient; it works best to have a doorman wait for any package, and alert the correct person when a package arrives for that person.  On the other hand, there is an extra delay for having a doorman wait while you wake up from your nap.

(x)inetd/systemd is like the doorman; it can be told to expect packets for some daemon, and wake up that daemon when it arrives.  But if that extra delay is not acceptable (some daemons take a long time to initialize), the daemon itself can be listening for arriving packets (stand-alone).
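
The doorman idea can be sketched in Python.  This is a toy, not how inetd or systemd is implemented: one loop waits on several listening sockets at once and hands each new connection to the matching handler function, which stands in for starting the real daemon.  All names here are made up for illustration.

```python
import selectors
import socket
import threading


def make_listener() -> socket.socket:
    """Listening TCP socket on the loopback address; the kernel picks the port."""
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.bind(("127.0.0.1", 0))
    s.listen()
    return s


def super_server(handlers: dict, requests_to_serve: int) -> None:
    """The doorman: wait on every listener, wake the right handler on arrival."""
    sel = selectors.DefaultSelector()
    for listener, handler in handlers.items():
        sel.register(listener, selectors.EVENT_READ, handler)
    served = 0
    while served < requests_to_serve:
        for key, _events in sel.select():
            conn, _addr = key.fileobj.accept()
            with conn:
                key.data(conn)          # hand the connection to its service
            served += 1
    sel.close()


def echo_service(conn):
    conn.sendall(conn.recv(1024))


def upper_service(conn):
    conn.sendall(conn.recv(1024).upper())


def ask(listener, payload: bytes) -> bytes:
    """Client side: connect to a listener's port, send a request, read the reply."""
    with socket.create_connection(listener.getsockname()) as c:
        c.sendall(payload)
        return c.recv(1024)


echo_l, upper_l = make_listener(), make_listener()
doorman = threading.Thread(
    target=super_server,
    args=({echo_l: echo_service, upper_l: upper_service}, 2),
)
doorman.start()
replies = [ask(echo_l, b"ping"), ask(upper_l, b"ping")]
doorman.join()
echo_l.close()
upper_l.close()
print(replies)   # [b'ping', b'PING']
```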

To see what is listening on a given port, use (as root) fuser [-v] port/proto  (for example: fuser ssh/tcp or fuser 22/tcp).  lsof -i proto:port works too (for example: lsof -i tcp:22).  For all listening ports, use (as root) lsof -i -sTCP:LISTEN or netstat -tl.

Network sockets have proven so useful and easy to work with that many types now exist.  Besides the ones for TCP, UDP, and IP (“raw”), there are sockets that support other network protocols, sockets for kernel-to-process communications (netlink sockets, used for example by udev), and process-to-process communications (unix sockets, similar to named pipes).  Sockets have been referenced throughout this course; now you know what they are.

TCP/IP Transports: Connection and Connectionless

The IP addresses are sufficient to route a packet from one computer to the destination.  However, two issues remain:  How to identify the source process (client) and destination process (server)?  The answer to this is to include another header (the transport layer header) that includes the source and destination port numbers as described above.

The other issue is dealing with errors that can occur.  One common approach is to have the sending process expect an acknowledgment from the receiving process.  If no such reply is found after a time-out period then the sender can resend the packet.  Implementing this correctly for every client and server gets old fast!

Another solution is to have the network itself guarantee delivery of the packets.  In this scheme, the two hosts set up a session, send the data, and tear down the session when done.  The TCP/IP system handles the time-outs and other issues.

The first scheme is known as datagram or connectionless service and is called UDP in the TCP/IP protocol suite.  The second scheme is known as a virtual circuit, or more commonly a connection-oriented service, and is called TCP.
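
To see why doing this yourself “gets old fast”, here is a sketch of the first scheme over UDP on the loopback interface.  Everything in it is an assumption made for illustration: the “ACK ” reply format, the function names, the half-second time-out, and a test server that deliberately ignores the first datagram to imitate a lost packet.

```python
import socket
import threading


def lossy_ack_server(sock: socket.socket, drop_first: int) -> None:
    """Ignore the first drop_first datagrams, then acknowledge one and exit."""
    seen = 0
    while True:
        data, addr = sock.recvfrom(2048)
        seen += 1
        if seen > drop_first:
            sock.sendto(b"ACK " + data, addr)
            return


def send_with_retry(dest, payload: bytes, timeout: float = 0.5, max_tries: int = 5) -> int:
    """Send payload over UDP until it is acknowledged; return the number of tries."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        s.settimeout(timeout)
        for attempt in range(1, max_tries + 1):
            s.sendto(payload, dest)
            try:
                reply, _addr = s.recvfrom(2048)
            except socket.timeout:
                continue                    # no reply in time: resend
            if reply == b"ACK " + payload:
                return attempt
    raise TimeoutError("no acknowledgment received")


server_sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
server_sock.bind(("127.0.0.1", 0))
server = threading.Thread(target=lossy_ack_server, args=(server_sock, 1))
server.start()

tries = send_with_retry(server_sock.getsockname(), b"hello")
server.join()
server_sock.close()
print(tries)   # 2: the first datagram was ignored, the second was acknowledged
```

With TCP, none of this bookkeeping appears in the application; the protocol stack does the equivalent work for every connection.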

RPC

Sun developed a different scheme for connecting a server to a port number.  Instead of using a well-known port number for each service, a single well-known port number (111) is used for the program portmapper (or portmap).  This program assigns each RPC service a unique port number at each boot.  A client wanting to use some RPC service sends a query to the portmapper, requesting the port number for that service.  This scheme is full of security holes, and should be turned off on your server unless you are using RPC services.  These include NFS (<v4) and NIS.

Another scheme is to use DNS “SRV” records to state an IP address and port number to use for each supported service.

The OSI Networking Model

Due to the complexity of networking, the various functions and terms have been standardized by the ISO.  Known as the open systems interconnect (OSI) this model of networking is standard knowledge for all IT workers.  Here’s a picture showing how TCP/IP networking compares (from fiberbit.com.tw):

Configuring Basic Networking

The NIC must be configured at boot time with many parameters, such as an IP address and mask.  In addition, your host will need a default gateway address to configure its routing table.  To use DNS, your computer must be assigned a hostname, a default domain name, and must be configured with the IP address of a DNS server to use to translate names to IP addresses.

An ISP’s DNS is not always the best choice to use.  Often these will never give you an error, but instead redirect a bad URL to a page of ads.  They will sometimes also track a user’s lookups for marketing purposes.  You can use alternative public DNS servers, such as 4.2.2.1 or 8.8.8.8, or OpenDNS.

(If you have Verizon as your residential ISP, you can currently (2015) opt-out of this, what Verizon calls “DNS Assistance”.)

The easiest way to configure TCP/IP networking is to let someone else do it.  One way to achieve this is to configure your system to use DHCP (dynamic host configuration protocol) for each NIC.  When the system brings up the NIC (usually at boot time), it will send a broadcast DHCP request packet.  If there is a DHCP server listening on that LAN, it responds with all the required networking parameters.  Your system uses that data to configure networking.

The other way to configure networking parameters is by manually editing various configuration files (or using some tool to edit those files).  This is called static addressing.

For wireless laptops, Fedora Linux systems come with a newer networking system called NetworkManager.  This software was poorly documented and didn’t work well for static, wired networking, but is much better now.  You can use chkconfig and service (or systemctl) to turn this daemon off and make sure it stays off, and then use those tools to turn on the older network service (you may have to install that).  However, Red Hat has modified NetworkManager to use the standard (for Red Hat) config files; see /etc/NetworkManager/NetworkManager.conf.  (Debian has done something similar.)  Thus, there is no real need to switch network services.  Even the GUI config utility, nm-tool, will use the standard config files.

The configuration of NICs on Red Hat systems is controlled by the file /etc/sysconfig/network-scripts/ifcfg-NameOfNIC.  In many cases, NameOfNIC is eth0.  On some systems, NICs are named differently (e.g., “p7p1”).

By convention, the ifcfg file’s suffix is the same as the string given by the DEVICE directive in the configuration file itself.  (Some versions of Fedora at least depend on that.)  System-wide settings go in /etc/sysconfig/network.  Note that a setting in the ifcfg file will override the same system-wide setting, for that interface.

As with all RH config files under /etc/sysconfig, you can find documentation for each file in /usr/share/doc/initscripts/sysconfig.txt.

In the directions that follow, be sure to change the NIC name shown (enp0s3) to your NIC’s actual name.  (You can use dmesg to see what name the kernel gave your NIC, or the “ip link” command.)  To configure the system for DHCP, this file should look something like this (bold lines are the ones you might need to change):

# Intel Corporation 82557/8/9 [Ethernet Pro 100]
TYPE=Ethernet
NAME=enp0s3
UUID=5988518b-ab91-3a9f-a874-9d5c5d25c862
DEVICE=enp0s3
HWADDR=00:06:5B:3D:43:0F
...
BOOTPROTO=dhcp
ONBOOT=yes

This will cause the network system to configure everything using DHCP.  To enable a normal user to set the interface up or down, add “USERCTL=yes”.

If using DHCP, you can add additional entries to control the configuration:  Add “PEERDNS=no” to prevent DHCP from updating /etc/resolv.conf.

For a static setup, this file should look like this (only the bold lines should be edited):

# Intel Corporation 82557/8/9 [Ethernet Pro 100]
TYPE=Ethernet
NAME=enp0s3
UUID=5988518b-ab91-3a9f-a874-9d5c5d25c862
DEVICE=enp0s3
HWADDR=00:06:5B:3D:43:0F
...
ONBOOT=yes
BOOTPROTO=none
IPADDR=192.168.0.7
PREFIX=24
GATEWAY=192.168.0.4

(The IPADDR and PREFIX can be followed by a number, as modern NICs and Linux support multiple addresses/prefixes per NIC.)  That will configure the IP address and mask, and the default route.  But not DNS.  To configure the DNS system when not using DHCP, you can edit the file /etc/resolv.conf, which should look something like the following:

search hccfl.edu
nameserver 169.139.222.4
nameserver 169.139.222.15
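
The file format is easy to check with a script.  The following sketch reads text in resolv.conf format and reports the search list and name servers; the function name is made up, only the “search” and “nameserver” keywords shown above are handled, and the sample text is the example above.

```python
import ipaddress

SAMPLE = """\
search hccfl.edu
nameserver 169.139.222.4
nameserver 169.139.222.15
"""


def parse_resolv_conf(text: str) -> dict:
    """Collect the search domains and name server addresses from resolv.conf text."""
    result = {"search": [], "nameserver": []}
    for line in text.splitlines():
        if line.startswith(("#", ";")):      # comment lines
            continue
        fields = line.split()
        if len(fields) < 2:
            continue
        keyword, values = fields[0], fields[1:]
        if keyword == "search":
            result["search"] = values        # a later search line replaces an earlier one
        elif keyword == "nameserver":
            ipaddress.ip_address(values[0])  # raises ValueError on a malformed address
            result["nameserver"].append(values[0])
    return result


config = parse_resolv_conf(SAMPLE)
print(config["search"])       # ['hccfl.edu']
print(config["nameserver"])   # ['169.139.222.4', '169.139.222.15']
```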

NetworkManager added new entries to the config file to also configure DNS and other aspects of networking, see below.  But I still just edit /etc/resolv.conf for that.

You must first turn off DHCP or changes to that file will be lost when DHCP does turn off.  It may also pay to make a copy of that file first, so you can see the IP addresses of the nameservers from the copy.

If using PEERDNS=no (or if not using DHCP), you can instead add entries such as “DNS[1|2|3]=ip-address” and “SEARCH="gcaw.org hccfl.edu"” to update resolv.conf with that information.  Thus you don’t need to edit resolv.conf.

Finally, you may need to set the hostname.  The default of “localhost.localdomain” is fine for most purposes.  If you do need to set a different hostname, use the hostname or hostnamectl command.  (There is no standard file to edit on a modern Linux system to set this, although many systems will pay attention to /etc/hostname.  You should also add an entry to /etc/hosts with your static IP address and hostname.)

Configuring Service Daemons Review

Stand-alone services are simple: you start some systemd service unit or run some Sys-V init script.  To have the service start at boot time, you enable it.  With systemd, on-demand services create a new service unit file for each incoming request, stored on a RAM disk.  These unit files are created from a template service unit, “nameOfService@.service”.  (The “@” identifies this as a template unit file.)

While all systemd managed daemons have sockets, for stand-alone services you can generally ignore them.  However, for on-demand services you need to start the socket (since there is no service unit yet to start and starting the template does nothing).  To enable on-demand services at boot time, you enable the socket.

(Do not confuse systemd socket unit files with the general networking concept of a socket.)

Network Security

At the host level, you have network security in several sub-systems: a packet filtering firewall (iptables or firewalld on Linux, and similar ones for other Unixes) is the first line of defense.  This can be used to allow or deny incoming or outgoing packets, and to collect various statistics.  TCP Wrappers can be used to examine incoming service requests and either allow or deny them (and log them).  Note the packet filter can also be configured for this; however, TCP Wrappers can allow or deny access on information not in the packet (time of day, availability of some resource, the username making the request, etc.)

The various services that listen for incoming requests can generally also be configured for security.  Additionally, any network services that authenticate users will likely use PAM or other security systems; you must remember to configure those too.

Incoming data, such as FTP and WebDAV uploads, email, etc., can be scanned for viruses and other malware.  A malware scanner such as ClamAV is used for this.

You can run network monitoring tools that look for known attacks or any suspicious activity, and block access to attacking hosts (as well as log and alert the SA).

Perfect security is an illusion!  Even with all this security, attackers can get in.  By setting up file permissions carefully and using service isolation techniques, you can limit or prevent a successful penetration from doing (much) harm to your server.

Network Trouble-Shooting Tools and Techniques

Common tools are ping, traceroute, mtr (a modern Linux replacement for those two tools), top, ipcalc and ethtool or mii-tool (Linux), host, dig, ip (Linux; ifconfig for Unix), and various log files.

Before using any tools, verify the problem exists and is network related (and not a turned off server).

Test with ping (or mtr) first, which tests layers 1 through 3 (i.e., basic IP connectivity).  Then if that works, try nc (netcat) or telnet next (to say port 80 on a web server, or port 25 on a mail server).  If that fails, the problem is usually a firewall between client and server blocking access.
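
When nc and telnet are not installed, the same “can I connect to this port?” test can be done with a few lines of Python.  This is a sketch: the function name port_open and the 2-second time-out are arbitrary choices, and the demonstration uses a temporary listener on the loopback address rather than a real server.

```python
import socket


def port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:          # refused, timed out, unreachable, bad name, ...
        return False


# Demonstration against a temporary local listener.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))
listener.listen()
test_port = listener.getsockname()[1]

while_listening = port_open("127.0.0.1", test_port)
listener.close()
after_close = port_open("127.0.0.1", test_port)
print(while_listening, after_close)   # True False
```

A time-out (rather than an immediate refusal) often points to a firewall silently dropping the packets.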

If ping fails, try traceroute to localize the fault.  The problem is almost always a bad cable, connector, or NIC, or some joker unplugged something.

If the problem appears to be with a host, use the command ifconfig (or ip addr on Linux) to examine the NIC parameters and route (or ip route) to examine the routing table.

Also, examine the resolver files to check for DNS errors.  Use nslookup and/or dig to check for DNS server errors (discussed in more detail below).

To examine the firewall setup (including masquerading), use iptables -L [-v] and iptables -t nat -L [-v] to list all rules (if you use iptables and not firewalld).  (For firewalld, use firewall-cmd --list-all.)

Often the best way to troubleshoot networking issues is to examine the IP packets.  This is a useful technique for security monitoring as well.  Various tools exist that can display network packets and save them to a file.  You can show all or, using a filter, only those packets of interest.

The tool wireshark is the most common and powerful tool for this.  Out of the box wireshark knows hundreds of protocols and can dissect them for you.  It has an easy GUI interface for basic tasks, although such a powerful tool can be challenging to master.

Sometimes you need a command line tool in order to use it from a timer service (capture packets at a certain time of day) or for use in a shell script.  tcpdump is a command line tool for this that is very mature.  But if you don’t want to learn it, you can use tshark, the command line version of wireshark.