TY - JOUR
T1 - Privacy-preserving network flow recording
AU - Shebaro, Bilal
AU - Crandall, Jedidiah R.
N1 - Funding Information:
This work was supported in part by the U.S. National Science Foundation (CNS-0905177, CNS-0844880). Any opinions, findings, conclusions, or recommendations expressed in this publication are those of the authors and do not necessarily reflect the views of the National Science Foundation. We would also like to thank the DFRWS anonymous reviewers and others who gave valuable input, such as the FloCon 2011 attendees. Monzy Merza’s conversations with us and help with early coding and testing were very valuable, and we are very grateful for them.
PY - 2011/8
Y1 - 2011/8
N2 - Network flow recording is an important tool with applications that range from legal compliance and security auditing to network forensics, troubleshooting, and marketing. Unfortunately, current network flow recording technologies do not allow network operators to enforce a privacy policy on the data that is recorded, in particular how this data is stored and used within the organization. Challenges to building such a technology include the public key infrastructure, scalability, and gathering statistics about the data while still preserving privacy. We present a network flow recording technology that addresses these challenges by using Identity Based Encryption in combination with privacy-preserving semantics for on-the-fly statistics. We argue that our implementation supports a wide range of policies that cover many current applications of network flow recording. We also characterize the performance and scalability of our implementation and find that the encryption and statistics scale well and can easily keep up with the rate at which commodity systems can capture traffic, with a couple of interesting caveats about the size of the subnet that data is being recorded for and how statistics generation is affected by implementation details. We conclude that privacy-preserving network flow recording is possible at 10 gigabit rates for subnets as large as a /20 (4096 hosts). Because network flow recording is one of the most serious threats to web privacy today, we believe that developing technology to enforce a privacy policy on the recorded data is an important first step before policy makers can make decisions about how network operators can and should store and use network flow data. Our goal in this paper is to explore the tradeoffs of performance and scalability vs. privacy, and the usefulness of the recorded data in forensics vs. privacy.
AB - Network flow recording is an important tool with applications that range from legal compliance and security auditing to network forensics, troubleshooting, and marketing. Unfortunately, current network flow recording technologies do not allow network operators to enforce a privacy policy on the data that is recorded, in particular how this data is stored and used within the organization. Challenges to building such a technology include the public key infrastructure, scalability, and gathering statistics about the data while still preserving privacy. We present a network flow recording technology that addresses these challenges by using Identity Based Encryption in combination with privacy-preserving semantics for on-the-fly statistics. We argue that our implementation supports a wide range of policies that cover many current applications of network flow recording. We also characterize the performance and scalability of our implementation and find that the encryption and statistics scale well and can easily keep up with the rate at which commodity systems can capture traffic, with a couple of interesting caveats about the size of the subnet that data is being recorded for and how statistics generation is affected by implementation details. We conclude that privacy-preserving network flow recording is possible at 10 gigabit rates for subnets as large as a /20 (4096 hosts). Because network flow recording is one of the most serious threats to web privacy today, we believe that developing technology to enforce a privacy policy on the recorded data is an important first step before policy makers can make decisions about how network operators can and should store and use network flow data. Our goal in this paper is to explore the tradeoffs of performance and scalability vs. privacy, and the usefulness of the recorded data in forensics vs. privacy.
KW - Identity based encryption
KW - NetFlow
KW - Network forensics
KW - Privacy preserving semantics
KW - Statistical database
UR - http://www.scopus.com/inward/record.url?scp=79961039951&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=79961039951&partnerID=8YFLogxK
U2 - 10.1016/j.diin.2011.05.011
DO - 10.1016/j.diin.2011.05.011
M3 - Article
AN - SCOPUS:79961039951
SN - 1742-2876
VL - 8
SP - S90-S100
JO - Digital Investigation
JF - Digital Investigation
IS - SUPPL.
ER -