Q&A Summaries for Session 1

Fireflies: Scalable Support for Intrusion-Tolerant Network Overlays
Havard Johansen, Andre Allavena, Robbert van Renesse

Q: Is it necessary for all nodes to have complete membership knowledge?
A: Yes.

Q: How do you choose t? For example, do you choose differently for video streaming than for software repositories?
A: The choice of t follows from the requirement that at most t of a member's 2t+1 predecessors be Byzantine. It is always chosen based on the probability p that a member is antagonistic. The certificate authority can determine this probability if it has some sense of whether the members are trustworthy.

Q: Why can't you identify a node as malicious without rebuttal upon t+1 accusations?
A: You can, but it is much more expensive than waiting on a single accusation. It is also worth noting that rebuttals and accusations are signed, which increases the trust that can be placed in a single accusation.

Q: The max bandwidth used is very low (50 bytes/second). How do you explain this?
A: The measurement does not include the overhead of a TCP connection, only the payload. The payload is the full accusation and rebuttal transmitted during gossip, as well as the gossip overhead.

Q: How did you choose the 20% Byzantine nodes in your benchmarks?
A: This was chosen to ensure a sufficient number of antagonists to demonstrate the utility of the design. It is not an upper bound, but it is high in practice (though not necessarily for networks of automated malicious bots).

Argos: An Emulator for Fingerprinting Zero-Day Attacks
Georgios Portokalidis, Asia Slowinska, Herbert Bos

Q: Regarding signatures, since the emulator monitors instructions, are the signatures based on register values or on network traffic?
A: The signature is based on the network data transmitted, because the contents of registers and memory are not a good representation of the attack.

Q: What do you do with the signature?
A: The SweetBait tool allows the automatic distribution of signatures to control centers and intrusion detection centers.

Q: The conclusions state that a disadvantage is that users interact with the emulator, not the server itself, and that this makes everything slower. Could you avoid this with a cluster of honeypots that just check the first few (perhaps 500) bytes of a TCP session?
A: This isn't a good idea, because many attacks start with a non-invasive snoop phase, which makes the choice of the number of bytes very specific (and easy to defeat once the value is known).

Q: While this "feel-out" phase does not indicate an attack, doesn't the slower behavior of the emulator suggest to an attacker that the server is a trap?
A: A bot can't tell the difference, and automated attacks are the more interesting ones. An automated agent has no way of distinguishing whether the slowdown is due to emulation or to some other factor, such as network delays.

Q: How do you deal with software that legitimately sends code across the network?
A: You could store the code on disk, at which point it loses its taint, and then reload the code after that point. It is also worth noting that bytecode is not native code, so it doesn't trigger emulation and is thus not an issue.

Q: What percentage of attacks can you catch once you have a signature?
A: The signatures are crude, but in a test using network traces we showed that there were no false positives with a good number of catches. There is a possibility of false negatives, however. One simple example is that denial of service attacks are not avoidable with this technique.

Practical Taint-based Protection Using Demand Emulation
Alex Ho, Michael Fetterman, Christopher Clark, Andrew Warfield, Steven Hand

Q: Can't you leverage read-only code pages in Linux to get around this problem?
A: In our system, both the application and kernel are protected, and it may not be appropriate to protect the kernel pages.
For example, Skype rewrites code on the fly, so using the no-execute bit would not be appropriate. Furthermore, this is not a Unix-only proposal, and it is critical to support the full x86 architecture, which supports self-modifying code.

Q: QEMU doesn't emulate SSE3 or multiple CPUs. How do you handle this?
A: The newest version of QEMU does have some multiprocessor support, but it isn't sufficient for the authors' purposes. For SSE, the authors put a lot of work into QEMU to add the ability to emulate it.

Q: If this strategy is used on a desktop machine, won't the end result be pollution of the entire system and full emulation?
A: No. Tainting is on a page-by-page basis, but more importantly, the tainting will only affect those applications that have tainted data. The rest of the applications in a multiprogrammed environment will not have tainted pages, and therefore won't be affected.

Q: What if all of the executables on the system come from NFS?
A: You ought to establish some sort of trust with the local NFS server to ensure that packets from NFS servers aren't tainted. In addition, there is a tool to "bless" or untaint disk blocks, which would allow code downloaded from the web to be blessed and then run without emulation.

Q: Where is the bigger win: eliminating false taints, or reducing the switching costs?
A: It's hard to say, but the intuition is that the bigger win would be in reducing the switching time. Since the system runs atop Xen, the VMM handles all exceptions and interrupts. This results in a "double bounce" when an emulated program receives an interrupt. First there is a bounce into Xen and out to the host OS to handle the interrupt; then there is a bounce into Xen and back to the emulator to resume processing. If the emulator were in the VMM, or if the interrupt handling were inside the emulator, this cost of four crossings would be dramatically reduced.

Q: Regarding emulation, on a taint fault, can you pass the byte-level granularity taint data structures?
A: No; the virtualized hardware does not do byte-level granularity, only page-level granularity. As a result, the only time that byte tainting occurs is inside of QEMU.
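
The difference between page-level and byte-level granularity in the last answer can be illustrated with a minimal sketch. This is not the authors' implementation: the 4 KiB page size, the TaintMap class, and all method names are assumptions made for illustration. The sketch models taint as a set of byte addresses (the view the emulator would hold) and derives the page-level view from it, which shows how an access to a clean byte on a tainted page still traps (a "false taint").

```python
PAGE_SIZE = 4096  # assumed page size, for illustration only


class TaintMap:
    """Hypothetical model of the two taint granularities discussed above."""

    def __init__(self):
        # Byte-level taint: in this sketch, the emulator's private view.
        self.tainted_bytes = set()

    def taint(self, addr, length=1):
        """Mark `length` bytes starting at `addr` as tainted."""
        self.tainted_bytes.update(range(addr, addr + length))

    def tainted_pages(self):
        """Page-level view: a page is tainted if any byte on it is tainted."""
        return {a // PAGE_SIZE for a in self.tainted_bytes}

    def page_faults_on(self, addr):
        """Page-granular check: any access to a tainted page traps."""
        return addr // PAGE_SIZE in self.tainted_pages()

    def byte_is_tainted(self, addr):
        """Byte-granular check, only available inside the emulator."""
        return addr in self.tainted_bytes

    def is_false_taint(self, addr):
        """The access traps at page level although the byte itself is clean."""
        return self.page_faults_on(addr) and not self.byte_is_tainted(addr)


# Taint four bytes in the middle of page 1.
tm = TaintMap()
tm.taint(0x1000 + 16, length=4)
```

In this model, touching address 0x1000 traps because its page holds tainted data, even though that byte is clean, while an access to page 2 runs without a trap.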