[Lucas10] 5.5. Useful Report Types

flow-report supports more than 70 types of reports and lets you analyze your traffic in more ways than you may have thought possible. In this section, I'll demonstrate the most commonly useful report types. Read the flow-report man page for the complete list.
Note:
Many of these reports are most useful when presented in graphical format or when prepared on a filtered subset of data. You'll look at how to do both of these later in this chapter.
5.5.1. IP Address Reports
Many traffic analysis problems focus on individual IP addresses. You've already spent some quality time with the ip-source-address report. These reports work similarly, but they have their own unique characteristics.
5.5.1.1. Highest Data Exchange: ip-address
To report on all flows by host, use the ip-address report. This totals both the flows sent and the flows received by the host. Here, you look for the host that processed the largest number of octets on the network. You lose the data's bidirectional nature, but this report quickly identifies your most network-intensive host.
# flow-cat * | flow-report -v TYPE=ip-address -v SORT=+octets
ip-address flows octets packets duration
192.0.2.4 107785 995021734 1656178 1659809423
192.0.2.37 24294 347444011 456952 322712670
207.46.209.247 50 134705214 151227 5934644
...
5.5.1.2. Flows by Recipient: ip-destination-address
This is the opposite of the ip-source-address report I used as an example report throughout the beginning of this chapter. It reports on traffic by destination address.
# flow-cat * | flow-report -v TYPE=ip-destination-address
ip-destination-address flows octets packets duration
158.43.128.72 16478 1090268 16816 49357139
192.0.2.37 12131 239545903 252438 162963278
198.6.1.1 7630 588990 7997 26371278
...
In this example, the host 158.43.128.72 has received 16,478 flows in 1,090,268 octets. Lots of people transmitted data to this host. You don't know whether this data is the result of connections initiated by this host or whether many hosts are connecting to this host. To answer that, you have to look at the actual connections. Use flow-nfilter to trim your data down to show only the flows involving this host, and use flow-print to see the data.
5.5.1.3. Most Connected Source: ip-source-address-destination-count
Many worms scan networks trying to find vulnerable hosts. If you have a worm infection, you'll want to know which host sends traffic to the greatest number of other hosts on the network. The ip-source-address-destination-count report shows exactly this.
# flow-cat * | flow-report -v TYPE=ip-source-address-destination-count
ip-source-address ip-destination-address-count flows octets packets duration
192.0.2.37 1298 12163 107898108 204514 159749392
158.43.128.72 5 16389 1962766 16711 49357139
192.0.2.4 2016 54280 127877204 785592 831419980
...
This report shows you that the host 192.0.2.37 sent flows to 1,298 other hosts, as well as the number of flows, octets, and packets of these connections.
5.5.1.4. Most Connected Destination: ip-destination-address-source-count
You can also count the number of sources that connect to each destination. This is similar to the previous report but will contain slightly different data. Some flows (such as broadcasts and some ICMP) go in only one direction, so you must consider destinations separately from sources.
# flow-cat * | flow-report -v TYPE=ip-destination-address-source-count
ip-destination-address ip-source-address-count flows octets packets duration
158.43.128.72 5 16478 1090268 16816 49357139
192.0.2.37 1303 12131 239545903 252438 162963278
198.6.1.1 2 7630 588990 7997 26371278
...
The ip-source-address-destination-count and ip-destination-address-source-count reports give additional insight into the key servers, resources, and users on your network, even when you don't have problems.
REPORTS THAT DON'T SORT BY EVERYTHING
Some reports don't offer the opportunity to sort by every field. For example, the two interconnectedness reports cannot sort by the number of hosts an address connects to. This is annoying, especially because this is precisely what you're interested in if you're running this report! On most flavors of Unix you can sort by a column by piping the output through sort -rnk columnnumber, as shown here:
flow-cat | flow-report | sort -rnk 2
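When you want the same column sort inside a script rather than at the shell, it is easy to reproduce. The following is a minimal Python sketch, not part of flow-tools. The sample rows are copied from the ip-source-address-destination-count report earlier in this section, and the function name sort_by_column is made up for illustration.

```python
# Sample rows from the ip-source-address-destination-count report.
# Columns: address, destination count, flows, octets, packets, duration.
REPORT = """\
192.0.2.37 1298 12163 107898108 204514 159749392
158.43.128.72 5 16389 1962766 16711 49357139
192.0.2.4 2016 54280 127877204 785592 831419980
"""


def sort_by_column(text, column):
    """Return report rows sorted numerically, largest first, on a 1-based column.

    This mirrors what `sort -rnk column` does for a purely numeric column.
    """
    rows = [line.split() for line in text.splitlines() if line.strip()]
    return sorted(rows, key=lambda row: int(row[column - 1]), reverse=True)


# Column 2 is the number of distinct destinations each source talked to.
by_destination_count = sort_by_column(REPORT, 2)
for row in by_destination_count:
    print(" ".join(row))
```

With real data you would read the report from standard input instead of a string; header lines would need to be skipped first, because the sketch assumes every row has a numeric value in the chosen column.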
5.5.2. Network Protocol and Port Reports
These reports identify the network ports used by TCP and UDP flows or, on a larger scale, just how much traffic is TCP, UDP, and other protocols.
5.5.2.1. Ports Used: ip-port
Forget about source and destination addresses. What TCP and UDP ports are the most heavily used on your network? The ip-port report tells you.
# flow-cat * | flow-report -v TYPE=ip-port -v SORT=+octets
ip-port flows octets packets duration
80 63344 877141857 1298560 1444603541
25 8903 361725472 475912 139074162
443 10379 136012764 346935 324609472
...
This looks suspiciously like assorted Internet services. Port 80 is regular web traffic; port 25 is email; and port 443 is encrypted web traffic. You can see how much traffic involves each of these ports, but it's a combination of inbound and outbound traffic. For example, you know that 63,344 flows either started or finished on port 80. These could be to a web server on the network or web client requests to servers off the network. To narrow this down, you really must filter the flows you examine, run a more specific report, or both. Still, this offers a fairly realistic answer to the question "How much of the traffic is web browsing or email?" especially if you use the +percent-total option.
5.5.2.2. Flow Origination: ip-source-port
To see the originating port of a flow, use the ip-source-port report. Here I'm sorting the ports in ascending order:
# flow-cat * | flow-report -v TYPE=ip-source-port -v SORT=-key
ip-source-port flows octets packets duration
0 215 4053775 23056 21289759
22 111 1281556 15044 4816416
25 4437 10489387 181655 69456345
49 19 3922 79 5135
...
Flows with a source port of zero are probably ICMP and certainly not TCP or UDP. It's best to filter your data to only TCP and UDP before running this report. Although ICMP flows use a destination port to represent the ICMP type and code, ICMP flows have no source port.
Source ports with low numbers, such as those in the previously shown report snippet, are almost certainly responses to services running on those ports. In a normal network, port 22 is SSH, port 25 is SMTP, and port 49 is TACACS.
5.5.2.3. Flow Termination: ip-destination-port
The ip-destination-port report identifies flow termination ports.
# flow-cat * | flow-report -v TYPE=ip-destination-port -v SORT=-key
ip-destination-port flows octets packets duration
0 91 3993212 22259 14707048
22 231 26563846 22155 5421745
25 4466 351236085 294257 69617817
49 19 6785 101 5135
...
These look an awful lot like the source ports. What gives? Because a flow is half of a TCP/IP connection, the destination port might be the destination for the data flowing from the server to the client. A report on the same data should show roughly as many flows starting on a port as terminating on that port. Sorting the report by port makes this very obvious.
5.5.2.4. Individual Connections: ip-source/destination-port
Part of the identifying information for a single TCP/IP connection is the source port and the destination port. The ip-source/destination-port report groups flows by common source and destination ports. Here, I'm reporting on port pairs and sorting them by the number of octets:
# flow-cat * | flow-report -v TYPE=ip-source/destination-port -v SORT=+octets
ip-source-port ip-destination-port flows octets packets duration
80 15193 3 62721604 43920 620243
4500 4500 115 57272960 101806 30176444
14592 25 2 28556024 19054 480319
...
The first connection appears to be responses to a web request, coming from port 80 to a high-numbered port. Three separate flows used this combination of ports. The second row is IPSec NAT-T traffic on port 4500, and the third is transmissions to the email server on port 25.
I find this report most useful after I prefilter the data to include only a pair of hosts, which gives me an idea of the traffic being exchanged between the two. You might also use this report to identify high-bandwidth connections and filter on those ports to identify the hosts involved, but if you're interested in the hosts exchanging the most traffic, the ip-address report is more suitable.
5.5.2.5. Network Protocols: ip-protocol
How much of your traffic is TCP, and how much is UDP? Do you have other protocols running on your network? The ip-protocol report breaks down the protocols that appear on your network. In this example, I'm using the +names option to have flow-report print the protocol name from /etc/protocols rather than using the protocol number. Looking up names in a static file is much faster than DNS resolution.
# flow-cat * | flow-report -v TYPE=ip-protocol -v OPTIONS=+names
ip-protocol flows octets packets duration
icmp 158 75123 965 6719987
tcp 83639 1516003823 2298691 1989109659
udp 76554 69321656 217741 296940177
esp 34 3820688 18720 8880078
vrrp 12 151708 3298 3491379
As you can see, clearly TCP and UDP are our most common protocols, but there is also an interesting amount of ESP traffic. ESP is one of the protocols used for IPSec VPNs.
REPORTS COMBINING ADDRESSES AND PORTS
flow-report also supports reports that provide source and destination addresses and ports together, in almost any combination. Read the flow-report manual page for the specific names of these reports. I find flow-print more useful for that type of analysis.
5.5.3. Traffic Size Reports
Has the number of large data transfers increased on your network over the past few days? If so, flow-report lets you dissect your traffic records and identify trends. These reports are most useful when graphed and compared to historical traffic patterns.
5.5.3.1. Packet Size: packet-size
How large are the packets crossing your network? You're probably familiar with the 1,500-byte limit on packet size, but how many packets actually reach that size? The packet-size report counts the packets of each size. Here I'm running this report and sorting the results by packet size:
# flow-cat * | flow-report -v TYPE=packet-size -v SORT=+key
packet size/flow flows octets packets duration
1500 5 2776500 1851 1406780
1499 2 14717980 9816 390603
1498 5 60253559 40207 999167
...
As you can see in the first row, 1,500-byte packets have been seen in five flows, containing a total of almost 2.8 million bytes. You've seen 1,851 of these 1,500-byte packets. Sorting by packets would identify the most and least common packet size.
5.5.3.2. Bytes per Flow: octets
How large are your individual flows? Do you have more large network transactions or small ones? To answer this, report on the bytes per flow with the octets report.
# flow-cat * | flow-report -v TYPE=octets -v SORT=-key
octets/flow flows octets packets duration
46 367 16882 367 1214778
48 59 2832 59 272782
...
168 496 83328 1311 5819361
...
This network had 367 46-octet flows, for a total of 16,882 octets.
In a small flow, the number of flows probably equals the number of packets, and each of these tiny flows has only one packet: the 59 flows of 48 octets contain 59 packets. When flows contain more data, each flow contains multiple packets: the 496 flows of 168 octets contain 1,311 packets.
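The relationship between flows and packets is easy to check by dividing one column by the other. This is a minimal Python sketch using the three rows shown in the octets report; the helper name packets_per_flow is invented for illustration and is not something flow-report provides.

```python
# Rows from the octets report: (octets per flow, flows, total octets, packets).
ROWS = [
    (46, 367, 16882, 367),
    (48, 59, 2832, 59),
    (168, 496, 83328, 1311),
]


def packets_per_flow(flows, packets):
    """Average number of packets carried by each flow in one report row."""
    return packets / flows


# Map each flow size to its average packet count.
averages = {
    size: packets_per_flow(flows, packets)
    for size, flows, _octets, packets in ROWS
}
for size, average in averages.items():
    print(f"{size}-octet flows: {average:.2f} packets per flow")
```

The tiny flows come out at exactly one packet each, while the 168-octet flows average between two and three packets.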
5.5.3.3. Packets per Flow: packets
The number of packets in a flow offers an idea of what kind of transactions is most common on your network. The sample DNS queries you looked at in Chapter 1 had only one packet in each flow, while long-running FTP sessions might have thousands or millions of packets. The packets per flow report, packets, tells you how many flows have each number of packets, as shown here:
# flow-cat * | flow-report -v TYPE=packets -v SORT=-key
packets/flow flows octets packets duration
1 74213 6978064 74213 19224735
2 3411 551388 6822 190544194
3 4764 2033952 14292 37130046
...
As you can see in the first row, this data contains 74,213 one-packet flows, carrying almost 7 million octets. (That sounds so much more impressive than 6.7MB, doesn't it?)
5.5.4. Traffic Speed Reports
I've had managers ask "How fast is the network?" so often that I've given up telling them that the question is meaningless. Saying that you have a gigabit Ethernet backbone sounds good, but it's like saying that your car's speedometer goes up to 120 miles per hour without mentioning that the engine starts making a sickly ratcheting cough at 40 miles per hour. Here are some ways to take a stab at answering that question in something approaching a meaningful way.
NETWORK SPEED CAVEATS
Remember, a long-running connection can pass traffic at different speeds through its lifetime. You've probably seen this yourself when downloading a CD or DVD image from the Internet. The connection might start off very quick, become slower partway through for any number of reasons, and accelerate again later. Flow records include only average speed information. For example, to determine how many packets a flow passed in a second, flow-report divides the number of packets in the flow by the number of seconds the flow lasted. This is good enough for most purposes, however, and it's certainly better than you'll get with anything short of packet capture.
Many of these reports are not terribly informative to the naked eye. When you're looking at a list of TCP ports that accepted connections, you can quickly say that, for example, ports 80 and 25 received 75 percent of your traffic. These reports are not so readily interpreted, though they do make good fodder for graphs. I've written these descriptions assuming you'll be feeding the results into a graphing program such as gnuplot or OpenOffice.org Calc. Those of you who can interpret these results without further processing can congratulate yourself on being either much smarter or much dumber than myself.
5.5.4.1. Counting Packets: pps
Another critical measure of network throughput is packets per second (pps). Many network vendors describe the limits of their equipment in packets per second. The pps report, much like the bps report, shows how many flows travel at the given number of packets per second.
# flow-cat * | flow-report -v TYPE=pps -v SORT=+key
pps/flow flows octets packets duration
3000 1 231 3 1
1000 70 4192 70 70
833 1 403 5 6
...
Wow! One flow went at 3,000 packets per second? Yes, technically, but note that it contained only three packets and lasted for one millisecond. Multiplying anything by a thousand exaggerates its impact.
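The arithmetic behind that startling number is just packets divided by duration. Here is a minimal Python sketch of the averaging, assuming durations are in milliseconds as in the report. The function name is made up, and both the truncation to a whole number and the refusal to handle zero-length flows are my assumptions rather than documented flow-report behavior.

```python
def packets_per_second(packets, duration_ms):
    """Average packet rate of a flow whose duration is given in milliseconds."""
    if duration_ms <= 0:
        # Assumption: a zero-length flow has no meaningful rate, so refuse it.
        raise ValueError("duration must be positive to compute a rate")
    return packets * 1000 / duration_ms


# (packets, duration in ms) from the three pps rows shown in the report.
for packets, duration_ms in [(3, 1), (70, 70), (5, 6)]:
    rate = packets_per_second(packets, duration_ms)
    # int() truncates, which matches the whole numbers in the report's key column.
    print(f"{packets} packets in {duration_ms} ms: {int(rate)} pps")
```

Three packets in one millisecond really does average out to 3,000 packets per second, which is why a rate computed over such a short flow says very little about the network.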
Again, this report isn't terribly useful to the naked eye but can be interesting when graphed.
5.5.4.2. Traffic at a Given Time: linear-interpolated-flows-octets-packets
The linear-interpolated-flows-octets-packets report averages all the flows fed into it and lists how many flows, octets, and packets passed each second. I find this the most useful "speed" report.
# flow-cat * | flow-report -v TYPE=linear-interpolated-flows-octets-packets
unix-secs flows octets packets
1334981479 35.605553 35334.016293 96.820015
1334981480 63.780553 62865.828793 184.570015
1334981481 38.702116 69297.703533 192.235604
...
The first column gives the time, in Unix epoch seconds: 1334981479 is equivalent to Saturday, April 21, 2012, at 11 minutes and 19 seconds after midnight, EDT. Each row that follows is one second later. In this second, I passed 35.6 flows, 35,334 octets, and 96.8 packets.
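Converting the epoch column to a readable time is worth doing before you graph anything. This is a minimal Python sketch that uses only the standard library; the fixed UTC-4 offset stands in for EDT and is an assumption that holds for this sample timestamp, not a general time zone rule.

```python
from datetime import datetime, timedelta, timezone

# Assumption: EDT is modeled as a fixed UTC-4 offset, with no DST handling.
EDT = timezone(timedelta(hours=-4), "EDT")

# First timestamp from the linear-interpolated report.
stamp = datetime.fromtimestamp(1334981479, tz=EDT)

print(stamp.isoformat())
print(stamp.strftime("%A, %B %d, %Y at %H:%M:%S %Z"))
```

For data that spans a daylight saving change, a real time zone database (for example, the zoneinfo module in recent Python versions) is a safer choice than a fixed offset.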
This report is ideal for answering many frequently asked questions, such as "How much traffic goes between our desktops and the domain controllers at remote sites?" Take the flow data from your internal connection, run it through flow-nfilter once to reduce it to traffic from your desktop address ranges, and then run it through again to trim that down to traffic with your remote domain controllers. Finally, feed the results into flow-report, and use the resulting report to generate graphable data.
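Most graphs of this kind show bits per second rather than octets per second, so a small conversion step usually sits between flow-report and the graphing program. The following Python sketch shows one way to do it, using the three sample rows from the linear-interpolated report. The function name and the choice of decimal megabits are assumptions for illustration.

```python
# Sample rows from the linear-interpolated-flows-octets-packets report.
# Columns: unix-secs, flows, octets, packets.
SAMPLE = """\
unix-secs flows octets packets
1334981479 35.605553 35334.016293 96.820015
1334981480 63.780553 62865.828793 184.570015
1334981481 38.702116 69297.703533 192.235604
"""


def megabits_per_second(text):
    """Return a list of (unix_seconds, Mbit/s) pairs from report rows."""
    points = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 4 or not fields[0].isdigit():
            continue  # skip the header line, blanks, and anything unexpected
        octets = float(fields[2])
        # 8 bits per octet; 1,000,000 bits per (decimal) megabit.
        points.append((int(fields[0]), octets * 8 / 1_000_000))
    return points


points = megabits_per_second(SAMPLE)
for seconds, mbps in points:
    print(f"{seconds} {mbps:.3f}")
```

The two-column output is in the plain "x y" form that gnuplot and spreadsheet programs accept.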
A little creativity will give you data on things you never expected you could see. For example, TCP resets are a sign of something being not quite right; either a client is misconfigured, a server daemon has stopped working, or a TCP connection has become so scrambled that one side or the other says "Hang up, I'm done."
You can use flow-nfilter to strip your flows down to TCP resets. (One TCP reset is one packet.) Run this report with -v FIELDS=-octets,-flows to display only the number of TCP resets in a given second, and you'll then have graphable data on TCP resets on your network.
Note:
Although the quality of a network administrator's work is difficult to measure, I suggest offering "before" and "after" pictures of TCP reset activity during your performance reviews.
Remember that a certain level of TCP reset activity is normal, and much of it is caused by buggy or graceless software. Do not let your boss give you a goal of "zero TCP resets" during your performance review.
5.5.5. Routing, Interfaces, and Next Hops
Flow records include information on which interfaces a packet uses. In simple cases, this information isn't terribly useful: If you have a router with one Internet connection and one Ethernet interface, you have a really good idea how packets flowed without any fancy network analysis. A router using BGP, with multiple Internet providers, will distribute outgoing traffic to all Internet providers based on its routing information. Most people use traceroute to identify which path the router takes to a particular destination. BGP routing is dynamic, however; the path a packet takes now might not be the path it took five minutes ago or during an outage. Routers that support NetFlow version 5 or newer include interface information with each flow, however, so you can retroactively identify the route a flow used. (Remember, software flow sensors, such as softflowd, do not have access to interface information.)
5.5.5.1. Interfaces and Flow Data
In Chapter 4, you filtered on router interfaces. Reporting on interfaces is the natural extension.
Remember, each router represents its interfaces with numbers. You might think of a router interface as Serial 0, but the router might call it Interface 8. A Cisco router might renumber its interfaces after a reboot, unless you use the snmp ifIndex persist option.
I'm using the router from Chapter 4 as a source of flow information. On this router, interfaces 1 and 2 are local Ethernet ports, and interfaces 7 and 8 are T1 circuits to two different Internet service providers.
5.5.5.2. The First Interface: input-interface
To see which interface a flow entered a router on, use the input-interface report. Here, I'm adding a filter to report on data only for a single router:
# flow-cat * | flow-nfilter -F router1-exports | flow-report -v TYPE=input-interface
input-interface flows octets packets duration
1 22976 136933632 306843 58766984
2 320 182048 1307 3214392
7 4934 59690118 165408 46161632
8 1316 7386629 11142 7592624
Most of these flows start on interface 1, with the fewest on interface 2.
5.5.5.3. The Last Interface: output-interface
To show the interfaces that are being used to leave a router, use the output-interface report.
# flow-cat * | flow-nfilter -F router1-exports | flow-report -v TYPE=output-interface
output-interface flows octets packets duration
0 1765 447958 5043 3599588
1 5057 66979701 175073 52900988
2 17545 20507633 56531 9440036
7 111 43079633 34710 8266712
8 5068 73177502 213343 41528308
The first thing I notice in this output is the sudden appearance of interface 0. This is a list of valid router interfaces, and 0 isn't one of them. What gives?
These are flows that arrived at the router and never left. The appearance of interface 0 prompted me to more closely scrutinize my flow data, and I found flows that appeared to come from IP addresses reserved for internal, private use. Closer inspection of the firewall revealed several rules that allowed internal traffic to reach the Internet-facing network segment without address translation, but the router dropped all traffic from these internal-only RFC 1918 addresses. I also found quite a few traffic streams from the public Internet that were sourced from these private IP addresses, but the router dropped them too, as I would expect.
The lesson to learn is, of course, that proper reporting will do nothing but make more work for you. But at least you'll be able to identify and fix problems before an outsider can use those problems against you.
5.5.5.4. The Throughput Matrix: input/output-interface
Putting the input-interface and output-interface reports together into a matrix showing which traffic arrived on which interface and left on which interface can be illuminating. Use the input/output-interface report for this. Here, I'm sorting the output by the number of flows so you can easily tell which pairs of interfaces see the greatest number of connections.
# flow-cat * | flow-nfilter -F router1-exports \
    | flow-report -v TYPE=input/output-interface -v SORT=+flows
input-interface output-interface flows octets packets duration
1 2 17539 20507195 56522 9438220
1 8 4801 73147574 212806 41001424
7 1 3888 59604956 164102 45390152
8 1 1169 7374745 10971 7510836
7 0 1040 84724 1297 769664
1 0 525 199230 2805 60628
2 8 267 29928 537 526884
8 0 147 11884 171 81788
1 7 111 43079633 34710 8266712
2 0 53 152120 770 2687508
7 2 6 438 9 1816
The busiest connection is between interface 1 (FastEthernet0/0) and interface 2 (FastEthernet0/1), shown in the first row. This might or might not make sense, depending on your network topology. In mine, it's expected. The second row, from interface 1 to interface 8, is traffic routed out one of the Internet circuits.
This report clearly indicates the difference between flows and bytes. As you can see in the row for interface 1 to interface 7, one of the connections with relatively few flows (111) is actually pushing a comparatively large amount of traffic (more than 43 million octets).
Note the absence of flows between interfaces 7 and 8. Such flows would indicate traffic entering on one of the Internet circuits and leaving by the other. You would have become a transit provider, carrying Internet traffic from one network to another. This would happen if, say, you sold a T1 to a third party and they sent their traffic through you to the Internet. If you're not an Internet backbone, this would be a serious problem.
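Checking for accidental transit is the kind of test worth automating. The following is a minimal Python sketch that scans input/output-interface report rows for flows that both enter and leave on an Internet-facing interface. The interface numbers 7 and 8 come from this router; the function name and the idea of running this as a scripted check are assumptions for illustration.

```python
# Rows from the input/output-interface report above.
# Columns: input-interface, output-interface, flows, octets, packets, duration.
MATRIX = """\
1 2 17539 20507195 56522 9438220
1 8 4801 73147574 212806 41001424
7 1 3888 59604956 164102 45390152
8 1 1169 7374745 10971 7510836
7 0 1040 84724 1297 769664
1 0 525 199230 2805 60628
2 8 267 29928 537 526884
8 0 147 11884 171 81788
1 7 111 43079633 34710 8266712
2 0 53 152120 770 2687508
7 2 6 438 9 1816
"""


def transit_rows(text, internet_interfaces):
    """Return (input, output, flows) for rows crossing between Internet circuits."""
    hits = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 6 or not fields[0].isdigit():
            continue  # skip headers and malformed lines
        inp, out = int(fields[0]), int(fields[1])
        # Transit means arriving on one Internet circuit and leaving on another.
        if inp in internet_interfaces and out in internet_interfaces and inp != out:
            hits.append((inp, out, int(fields[2])))
    return hits


found = transit_rows(MATRIX, {7, 8})
print(found if found else "no transit traffic between Internet circuits")
```

On this sample data the list is empty, which is the result you want to see.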
5.5.5.5. The Next Address: ip-next-hop-address
Reporting by interface gives you a good general idea of where traffic is going, and reporting by IP address offers a detailed view. For an intermediate view, use the ip-next-hop-address report. You do not need to filter this report by the router offering the flows, because the report doesn't confuse the results with interface numbers. This report effectively tells you where the network traffic is going, both for your local hosts and for your upstream providers. I've sorted this report by octets so that the highest-bandwidth next hops appear first.
# flow-cat * | flow-report -v TYPE=ip-next-hop-address -v SORT=+octets
ip-next-hop-address flows octets packets duration
192.0.2.4 2490 154742050 174836 31131852
95.116.11.45 5068 73177502 213343 41528308
192.0.2.37 5944 65552868 73357 13692932
12.119.119.161 111 43079633 34710 8266712
192.0.2.13 2370 21382159 21595 4996548
192.0.2.194 17545 20507633 56531 9440036
66.125.104.149 17534 20506982 56521 9447180
...
As you can see in the first row of this report, the most heavily used next hop is the proxy server. This isn't a great surprise.
The second most heavily used next hop isn't even an address on my network. It's the ISP's side of one of my T1 circuits, shown in the second row. This hop represents traffic leaving my network. I also have IP addresses for the second and third ISPs farther down the list.
5.5.5.6. Where Traffic Comes from and How It Gets There: ip-source-address/output-interface
flow-report includes reports based on IP addresses and interfaces. Because these reports are so similar, I'll cover two in detail, and I'll let you figure out the rest.
The ip-source-address/output-interface report shows the source address of a flow and the interface the flow left the router on. If you filter your underlying flow data by an individual host of interest and run that data through this report, you'll get information on how much traffic this one host sent to each of your Internet circuits as well as information on how much data that host received from each remote host. In the following report, I'm filtering by the router exporting the flows to avoid interface number confusion and by the IP address of my main external NAT address:
# flow-cat * | flow-nfilter -F router1-exports \
    | flow-nfilter -F ip-addr -v ADDR=192.0.2.4 \
    | flow-report -v TYPE=ip-source-address/output-interface
ip-source-address output-interface flows octets packets duration
192.0.2.4 2 3553 422428 5849 1881348
192.0.2.4 8 324 3826147 69225 2851980
198.22.63.8 1 137 56475 762 915472
...
192.0.2.4 7 2 124 2 0
...
The first entry gives the proxy server itself as a source address and shows that I sent many flows out interface 2. That's an Ethernet to another local network segment. You also see in the second entry that the proxy sent large amounts of traffic out interface 8, one of the Internet connections.
You'll see entries, such as the one for 198.22.63.8, for remote IP addresses that send a comparatively small number of flows.
The surprising entry here is the last one shown, where the proxy server sends a really small amount of traffic out interface 7, the other Internet circuit. Measurements show that this other circuit is consistently heavily used. Whatever is using this circuit isn't the main proxy server, however. I could identify the traffic going out over this circuit by removing the filter on the main proxy server and adding a filter for interface 7. I'll do that with a different report.
5.5.5.7. Where Traffic Goes, and How It Gets There: ip-destination-address/input-interface
After viewing the results of the previous report, I'm curious about what hosts exchange traffic over interface 7. In Chapter 4, I created a filter that passed all traffic crossing interface 7. You'll use that filter on traffic from this router together with a flow report to see what's happening. Rather than reporting on the source address and the output interface, you'll use ip-destination-address/input-interface to see where traffic arriving on a particular interface is going. The resulting command line might be long enough to scare small children, but it will answer the question.
# flow-cat * | flow-nfilter -F router1-exports \
    | flow-nfilter -F interface7 \
    | flow-report -v TYPE=ip-destination-address/input-interface -v SORT=+octets
ip-destination-address input-interface flows octets packets duration
192.0.2.7 7 2 27347244 22122 3601016
69.147.97.45 1 2 19246168 12853 232400
192.0.2.8 7 2 15442834 11779 3600988
76.122.146.90 1 2 14113638 56214 3601884
...
Remember, I designed the filter interface7 so that it matched traffic either entering or leaving over interface 7. That's why this report includes both input interfaces 1 and 7.
Two local hosts, both web servers, receive most of the traffic sent to you over this router interface. More investigation shows that the Internet provider for this line has good connectivity to home Internet providers, such as Comcast and AT&T. The other provider has better connectivity to business customers. (How do you know what kind of connectivity your providers have? You can extract this information from the BGP information in flow records.)
5.5.5.8. Other Address and Interface Reports
flow-report includes two more reports for interfaces and addresses, ip-source-address/input-interface and ip-destination-address/output-interface. After the previous two examples, you should have no trouble using or interpreting these reports.
5.5.6. Reporting Sensor Output
If you have multiple sensors feeding a single collector, you might want to know how much data each sensor transmits. Use the ip-exporter-address report to find that out.
# flow-cat * | flow-report -v TYPE=ip-exporter-address
ip-exporter-address flows octets packets duration
192.0.2.3 29546 204192427 484700 115735632
192.0.2.12 36750 202920788 231118 39230288
As you can see in this report, records from the first router included fewer flows but more octets than those from the second router. Your results will vary depending on the throughput of each of your routers, the kind of traffic they carry, and their sampling rate.
5.5.7. BGP Reports
Flow records exported from BGP-speaking hardware include Autonomous System (AS) information. Reporting on this information tells you which remote networks you are communicating with and even how you reached those networks. If your network does not use BGP, these report types are of no use to you, and you can skip the rest of this chapter.
Flow-tools includes many reports and tools of interest to transit providers, but few if any readers of this book are transit providers. BGP-using readers are probably clients of multiple ISPs and use multiple providers for redundancy. I'll cover flow BGP information from the BGP user's perspective. Transit providers reading this book are encouraged to read the flow-report manual page for a complete list of reports involving AS numbers.
Both of you.
5.5.7.1. Using AS Information
What possible use can this type of AS information be for a network engineer? Knowing who you exchange traffic with might have little impact on day-to-day troubleshooting, but it has a great impact on who you purchase bandwidth from.
When the time comes to choose Internet providers, run reports to see who you consistently exchange the most traffic with. If you know that you consistently need good connectivity to three particular remote networks, use them as bullet points in your negotiations with providers. If you don't make bandwidth purchasing decisions, provide the decision maker with this information. The statement "40 percent of our Internet traffic goes to these three companies" is much more authoritative than "I worked with company X before, and they were pretty good."
5.5.7.2. Traffic's Network of Origin: source-as
The source-as report identifies the AS where flows originated. I'll sort this report by octets, because that's how I pay for bandwidth.
# flow-cat * | flow-nfilter -F router1-exports \
    | flow-report -v TYPE=source-as -v SORT=+octets
source-as flows octets packets duration
0 23024 137077373 307572 61361020
14779 2 19246168 12853 232400
33668 136 15664027 64345 14486504
21502 2 5087464 3450 47692
...
The invalid AS 0 appears first. When traffic originates or terminates locally, flow sensors do not record an AS number for the local end of that traffic, which means that your local AS number will never appear in a flow report. The source AS of traffic you transmit shows up as 0, and the destination AS of traffic you receive is also shown as 0. The only time you will see both a source and a destination AS number in a flow record is if the exporter is a transit provider, such as an Internet backbone. Traffic coming from AS 0 is the total of all traffic sourced by your network and transmitted to other networks. You might want to filter out all flows with a source AS of 0 from your data before running the report to remove the information about the data that your network transmits.
In the hour shown in the earlier report, the largest amount of traffic originated from AS 14779 (Inktomi/Yahoo!), but it included only two flows. I suspect that if you were to filter this same data on AS 14779 and run flow-print against it, you'd see that someone had downloaded a file, and I would further guess that the proxy logs would show that someone needed printer software. You can repeat this exercise for each of the AS numbers in the list.
5.5.7.3. Destination Network: destination-as
To see where you're sending traffic to, use the destination-as report. For a slightly different view of traffic, sort by the number of flows.
# flow-cat * | flow-nfilter -F router1-exports \
    | flow-report -v TYPE=destination-as -v SORT=+flows
destination-as flows octets packets duration
702 11834 767610 11869 23248
0 6828 67428097 180125 56502392
701 4154 459973 6372 1893152
3209 397 553003 9751 1220164
...
As you can see in the first row, you sent more flows to AS 702 than you received from everybody else combined (the destination AS 0 row). Also note that AS 701 belongs to the same organization as AS 702, but flow-report does not aggregate them. Different autonomous systems within one organization almost certainly have slightly different routing policies, despite the best efforts of their owner to coordinate their technical teams.
5.5.7.4. BGP Reports and Friendly Names
The BGP reports can use friendly names, which will save you the trouble of looking up the owner of interesting AS numbers. Although whois is a fast command-line tool for checking the ownership of AS numbers and you can look up AS information on any number of registry websites, none of these interfaces is suitable for automation. flow-tools gets around this by keeping a static list of AS assignments in the file asn.sym. Registries are continuously assigning and revoking AS numbers, however, so the list included with flow-tools will quickly become obsolete. To get the best information on AS names, you must update flow-tools' list of AS numbers.
To update the list of AS numbers, first download the latest list of ARIN assignments from ftp://ftp.arin.net/info/asn.txt, and save it on your analysis server.
Flow-tools includes gasn, a small script to strip out all of ARIN's comments and instructions and convert ARIN's list to a format it understands. The standard flow-tools installation process doesn't install this rarely used program in one of the regular system program directories, but you can probably find it under /usr/local/flow-tools/share/flow-tools if you installed from source. Use locate gasn to find the program if you installed from an operating system package.
Here, you feed ARIN's asn.txt in the current directory to the gasn script, which on this system is located in /usr/local/share/flow-tools, and produce a new file, newasn.sym:
# cat asn.txt | perl /usr/local/share/flow-tools/gasn > newasn.sym
Take a look at your newasn.sym file. The contents should resemble these:
0 IANA-RSVD-0
1 LVLT-1
2 DCN-AS
3 MIT-GATEWAYS
...
As of this writing, this file contains more than 64,000 AS numbers, each on its own line.
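Before replacing the live file, it is worth a quick check that the converted list parses the way you expect. This is a minimal Python sketch that assumes the "number, whitespace, name" layout shown in the sample above. The function name is invented, and the sketch reads from a string so that it is self-contained; a real check would read newasn.sym instead.

```python
# The first lines of newasn.sym as shown in the sample above.
SAMPLE = """\
0 IANA-RSVD-0
1 LVLT-1
2 DCN-AS
3 MIT-GATEWAYS
"""


def load_as_names(text):
    """Build a {AS number: name} dictionary from asn.sym-style lines."""
    names = {}
    for line in text.splitlines():
        parts = line.split(None, 1)  # split on the first run of whitespace only
        if len(parts) == 2 and parts[0].isdigit():
            names[int(parts[0])] = parts[1].strip()
    return names


as_names = load_as_names(SAMPLE)
print(len(as_names), "AS numbers loaded")
print(as_names.get(3, "unknown"))
```

A line count far below what you expect, or a lookup of a well-known AS that comes back "unknown", suggests the conversion went wrong and the old file should stay in place.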
Your system should already have an existing asn.sym file, possibly in /usr/local/etc/flow-tools/ or /usr/local/flow-tools/etc/sym/. Replace that file with your new file. When you add the +names option to flow-report, you should see the current AS names.
# flow-cat * | flow-nfilter -F router1-exports \
    | flow-report -v TYPE=source-as -v SORT=+octets -v OPTIONS=+names
source-as flows octets packets duration
IANA-RSVD-0 23024 137077373 307572 61361020
INKTOMI-LAWSON 2 19246168 12853 232400
MICROSOFT-CORP---MSN-AS-BLOCK 64 4669359 3259 106792
GOOGLE 130 4052114 3184 616152
...
Not all AS numbers have sensible names, especially those that are acronyms or words in foreign languages, but you can easily pick out some of the obvious ones and identify the others with whois.