NFS Tracing By Passive Network Monitoring

Matt Blaze
Department of Computer Science
Princeton University
mab@cs.princeton.edu

ABSTRACT

Traces of filesystem activity have proven to be useful for a wide variety of purposes, ranging from quantitative analysis of system behavior to trace-driven simulation of filesystem algorithms. Such traces can be difficult to obtain, however, usually entailing modification of the filesystems to be monitored and runtime overhead for the period of the trace. Largely because of these difficulties, a surprisingly small number of filesystem traces have been conducted, and few sample workloads are available to filesystem researchers.

This paper describes a portable toolkit for deriving approximate traces of NFS [1] activity by non-intrusively monitoring the Ethernet traffic to and from the file server. The toolkit uses a promiscuous Ethernet listener interface (such as the Packetfilter [2]) to read and reconstruct NFS-related RPC packets intended for the server. It produces traces of the NFS activity as well as a plausible set of corresponding client system calls. The tool is currently in use at Princeton and other sites, and is available via anonymous ftp.

1. Motivation

Traces of real workloads form an important part of virtually all analysis of computer system behavior, whether it is program hot spots, memory access patterns, or filesystem activity that is being studied. In the case of filesystem activity, obtaining useful traces is particularly challenging. Filesystem behavior can span long time periods, often making it necessary to collect huge traces over weeks or even months. Modification of the filesystem to collect trace data is often difficult, and may result in unacceptable runtime overhead. Distributed filesystems exacerbate these difficulties, especially when the network is composed of a large number of heterogeneous machines.
As a result of these difficulties, only a relatively small number of traces of Unix filesystem workloads have been conducted, primarily in computing research environments. [3], [4] and [5] are examples of such traces.

Since distributed filesystems work by transmitting their activity over a network, it would seem reasonable to obtain traces of such systems by placing a "tap" on the network and collecting trace data based on the network traffic. Ethernet [6] based networks lend themselves to this approach particularly well, since traffic is broadcast to all machines connected to a given subnetwork. A number of general-purpose network monitoring tools are available that "promiscuously" listen to the Ethernet to which they are connected; Sun's etherfind [7] is an example of such a tool. While these tools are useful for observing (and collecting statistics on) specific types of packets, the information they provide is at too low a level to be useful for building filesystem traces. Filesystem operations may span several packets, and may be meaningful only in the context of other, previous operations.

Some work has been done on characterizing the impact of NFS traffic on network load. In [8], for example, the results of a study are reported in which Ethernet traffic was monitored and statistics gathered on NFS activity. While useful for understanding traffic patterns and developing a queueing model of NFS loads, these previous studies do not use the network traffic to analyze the file access traffic patterns of the system, focusing instead on developing a statistical model of the individual packet sources, destinations, and types.

This paper describes a toolkit for collecting traces of NFS file access activity by monitoring Ethernet traffic. A "spy" machine with a promiscuous Ethernet interface is connected to the same network as the file server. Each NFS-related packet is analyzed and a trace is produced at an appropriate level of detail.
The tool can record the low-level NFS calls themselves or an approximation of the user-level system calls (open, close, etc.) that triggered the activity.

We partition the problem of deriving NFS activity from raw network traffic into two fairly distinct subproblems: that of decoding the low-level NFS operations from the packets on the network, and that of translating these low-level commands back into user-level system calls. Hence, the toolkit consists of two basic parts, an "RPC decoder" (rpcspy) and the "NFS analyzer" (nfstrace). rpcspy communicates with a low-level network monitoring facility (such as Sun's NIT [9] or the Packetfilter [2]) to read and reconstruct the RPC transactions (call and reply) that make up each NFS command. nfstrace takes the output of rpcspy and reconstructs the system calls that occurred as well as other interesting data it can derive about the structure of the filesystem, such as the mappings between NFS file handles and Unix file names. Since there is not a clean one-to-one mapping between system calls and lower-level NFS commands, nfstrace uses some simple heuristics to guess a reasonable approximation of what really occurred.

1.1. A Spy's View of the NFS Protocols

It is well beyond the scope of this paper to describe the protocols used by NFS; for a detailed description of how NFS works, the reader is referred to [10], [11], and [12]. What follows is a very brief overview of how NFS activity translates into Ethernet packets.

An NFS network consists of servers, to which filesystems are physically connected, and clients, which perform operations on remote server filesystems as if the disks were locally connected. A particular machine can be a client or a server or both.
Clients mount remote server filesystems in their local hierarchy just as they do local filesystems; from the user's perspective, files on NFS and local filesystems are (for the most part) indistinguishable, and can be manipulated with the usual filesystem calls.

The interface between client and server is defined in terms of 17 remote procedure call (RPC) operations. Remote files (and directories) are referred to by a file handle that uniquely identifies the file to the server. There are operations to read and write bytes of a file (read, write), obtain a file's attributes (getattr), obtain the contents of directories (lookup, readdir), create files (create), and so forth. While most of these operations are direct analogs of Unix system calls, notably absent are open and close operations; no client state information is maintained at the server, so there is no need to inform the server explicitly when a file is in use. Clients can maintain buffer cache entries for NFS files, but must verify that the blocks are still valid (by checking the last write time with the getattr operation) before using the cached data.

An RPC transaction consists of a call message (with arguments) from the client to the server and a reply message (with return data) from the server to the client. NFS RPC calls are transmitted using the UDP/IP connectionless unreliable datagram protocol [13]. The call message contains a unique transaction identifier which is included in the reply message to enable the client to match the reply with its call. The data in both messages is encoded in an "external data representation" (XDR), which provides a machine-independent standard for byte order, etc.

Note that the NFS server maintains no state information about its clients, and knows nothing about the context of each operation outside of the arguments to the operation itself.

2. The rpcspy Program

rpcspy is the interface to the system-dependent Ethernet monitoring facility; it produces a trace of the RPC calls issued between a given set of clients and servers. At present, there are versions of rpcspy for a number of BSD-derived systems, including ULTRIX (with the Packetfilter [2]), SunOS (with NIT [9]), and the IBM RT running AOS (with the Stanford enet filter).

For each RPC transaction monitored, rpcspy produces an ASCII record containing a timestamp, the name of the server, the client, the length of time the command took to execute, the name of the RPC command executed, and the command-specific arguments and return data. Currently, rpcspy understands and can decode the 17 NFS RPC commands, and there are hooks to allow other RPC services (for example, NIS) to be added reasonably easily.

The output may be read directly or piped into another program (such as nfstrace) for further analysis; the format is designed to be reasonably friendly to both the human reader and other programs (such as nfstrace or awk).

Since each RPC transaction consists of two messages, a call and a reply, rpcspy waits until it receives both these components and emits a single record for the entire transaction. The basic output format is 7 vertical-bar separated fields:

timestamp | execution-time | server | client | command-name | arguments | reply-data

where timestamp is the time the reply message was received, execution-time is the time (in microseconds) that elapsed between the call and reply, server is the name (or IP address) of the server, client is the name (or IP address) of the client followed by the userid that issued the command, command-name is the name of the particular program invoked (read, write, getattr, etc.), and arguments and reply-data are the command-dependent arguments and return values passed to and from the RPC program, respectively.

The exact format of the argument and reply data is dependent on the specific command issued and the level of detail the user wants logged.
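
Because the record is a flat list of vertical-bar separated fields, a downstream program can split it mechanically. The following Python fragment is a minimal sketch of such a consumer, not part of the toolkit: the names RpcRecord and parse_record are hypothetical, the argument and reply fields are left as uninterpreted strings, and it assumes that no bar character occurs inside the arguments field. The sample line is the read record that the paper itself gives as an example.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass
class RpcRecord:
    """One rpcspy output record (hypothetical container, not from the paper)."""
    timestamp: Decimal   # Unix time at which the reply was received
    exec_usec: int       # microseconds elapsed between call and reply
    server: str          # server name or IP address
    client: str          # client name or IP address
    uid: str             # userid appended to the client field
    command: str         # read, write, getattr, ...
    arguments: str       # command-dependent, kept as raw text
    reply: str           # command-dependent, kept as raw text

    @property
    def call_time(self) -> Decimal:
        # The call was issued exec_usec microseconds before the reply.
        return self.timestamp - Decimal(self.exec_usec) / Decimal(1_000_000)


def parse_record(line: str) -> RpcRecord:
    """Split one record into the seven fields named in the field list."""
    # maxsplit=6 keeps any further bars inside the final reply-data field.
    # Assumption: the arguments field itself contains no bar character.
    fields = [f.strip() for f in line.split("|", 6)]
    if len(fields) != 7:
        raise ValueError(f"expected 7 fields, got {len(fields)}")
    ts, usec, server, client_uid, command, arguments, reply = fields

    # The userid follows the last dot, so dotted IP addresses still work.
    client, sep, uid = client_uid.rpartition(".")
    if not sep:
        client, uid = client_uid, ""

    return RpcRecord(Decimal(ts), int(usec), server, client, uid,
                     command, arguments, reply)


if __name__ == "__main__":
    sample = ('690529992.167140 | 11717 | paramount | merckx.321 | read | '
              '{"7b1f00000000083c", 0, 8192} | ok, 1871')
    rec = parse_record(sample)
    print(rec.command, rec.client, rec.uid, rec.call_time)
```

Decimal is used for the timestamp so that subtracting the microsecond execution time does not pick up binary floating-point rounding; a consumer that only needs coarse timing could use float instead.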
For example, a typical NFS command is recorded as follows:

690529992.167140 | 11717 | paramount | merckx.321 | read | {"7b1f00000000083c", 0, 8192} | ok, 1871

In this example, uid 321 at client "merckx" issued an NFS read command to server "paramount". The reply was issued at (Unix time) 690529992.167140 seconds; the call command occurred 11717 microseconds earlier. Three arguments are logged for the read call: the file handle from which to read (repr...