Data mining :
Data mining is the process of extracting patterns from data. As more data are gathered, with the amount of data doubling every three years, data mining is becoming an increasingly important tool to transform these data into information. It is commonly used in a wide range of profiling practices, such as marketing, surveillance, fraud detection and scientific discovery.
While data mining can be used to uncover patterns in data samples, it is important to be aware that the use of non-representative samples of data may produce results that are not indicative of the domain. Similarly, data mining will not find patterns that may be present in the domain, if those patterns are not present in the sample being “mined”. There is a tendency for insufficiently knowledgeable “consumers” of the results to attribute “magical abilities” to data mining, treating the technique as a sort of all-seeing crystal ball. Like any other tool, it only functions in conjunction with the appropriate raw material: in this case, indicative and representative data that the user must first collect. Further, the discovery of a particular pattern in a particular set of data does not necessarily mean that pattern is representative of the whole population from which that data was drawn. Hence, an important part of the process is the verification and validation of patterns on other samples of data.
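The effect of a non-representative sample can be illustrated with a small simulation. The sketch below is only an illustration under made-up assumptions: the population, the two customer groups ("frequent" and "occasional"), their proportions and their spending figures are all hypothetical and do not come from any real data set.

```python
import random
from statistics import mean


def make_population(seed=0, size=10_000):
    """Build a hypothetical population of (group, spend) records.

    Assumption for illustration: 30% are 'frequent' customers who spend
    about 100 on average, 70% are 'occasional' customers who spend about 40.
    """
    rng = random.Random(seed)
    population = []
    for _ in range(size):
        if rng.random() < 0.3:
            population.append(("frequent", rng.gauss(100.0, 10.0)))
        else:
            population.append(("occasional", rng.gauss(40.0, 10.0)))
    return population


def average_spend(records):
    """Mean spend over a list of (group, spend) records."""
    return mean(spend for _, spend in records)


population = make_population()
rng = random.Random(1)

# Representative sample: drawn uniformly at random from the whole population.
representative = rng.sample(population, 500)

# Non-representative sample: only one group was ever recorded.
frequent_only = [record for record in population if record[0] == "frequent"]
biased = rng.sample(frequent_only, 500)

print("population mean:     ", round(average_spend(population), 1))
print("representative mean: ", round(average_spend(representative), 1))
print("biased sample mean:  ", round(average_spend(biased), 1))
```

Under these assumptions the representative sample lands close to the population average, while the sample restricted to one group overstates it considerably. No mining technique applied to the second sample could recover the behavior of the group that was never collected.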
The term data mining has also been used in a related but negative sense, to mean the deliberate search for apparent but not necessarily representative patterns in large amounts of data. To avoid confusion with the other sense, the terms data dredging and data snooping are often used. Note, however, that dredging and snooping can be (and sometimes are) used as exploratory tools when developing and clarifying hypotheses.
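Why dredging yields apparent patterns, and why validation on other samples exposes them, can be sketched as follows. This is a minimal illustration, not a standard procedure: every column is pure random noise by construction, and the sample sizes (30 rows, 200 candidate features, 2000 hold-out rows) are arbitrary choices made for the example.

```python
import math
import random
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def draw_noise(rng, n_rows, n_features):
    """Return a target column and feature columns that are independent noise."""
    target = [rng.gauss(0, 1) for _ in range(n_rows)]
    features = [[rng.gauss(0, 1) for _ in range(n_rows)]
                for _ in range(n_features)]
    return target, features


rng = random.Random(42)
N_FEATURES = 200

# "Dredging": few rows, many candidate features, keep the strongest correlation.
target, features = draw_noise(rng, 30, N_FEATURES)
correlations = [pearson(column, target) for column in features]
best = max(range(N_FEATURES), key=lambda j: abs(correlations[j]))
in_sample_r = correlations[best]

# Validation: measure the same feature on a fresh, larger sample.
new_target, new_features = draw_noise(rng, 2000, N_FEATURES)
holdout_r = pearson(new_features[best], new_target)

print("best in-sample correlation:", round(in_sample_r, 2))
print("same feature on new data:  ", round(holdout_r, 2))
```

Because the best of many candidates is selected after the fact, the in-sample correlation looks substantial even though no real relationship exists; on the fresh sample it falls back toward zero.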
Profiling practices :
One of the most challenging problems of the information society is dealing with the increasing data overload. Due to the digitalization of all sorts of content and due to the improvement and drop in cost of recording technologies, the amount of available information is enormous and is increasing exponentially. It has thus become important for companies, governments and individuals to be able to discriminate information from noise, detecting those data that are useful or interesting. The development of profiling technologies must be seen against this background. These technologies are thought to efficiently collect and analyse data in order to find or test knowledge in the form of statistical patterns between data. This process is called Knowledge Discovery in Databases (KDD) (Fayyad, Piatetsky-Shapiro & Smyth 1996), which provides the profiler with sets of correlated data that are used as “profiles”.
Profiling practices refer to the whole process of construction and application of profiles, as defined above. This entry focuses on profiles that have been generated by computerized profiling technologies. What characterizes profiling technologies is the use of algorithms or other mathematical techniques that allow one to discover patterns or correlations in large quantities of data, aggregated in databases. When these patterns or correlations are used to identify or represent people, they can be called profiles. Unlike a discussion of profiling technologies or population profiling, the notion of profiling practices is not just about the construction of profiles but also concerns the application of group profiles to individuals, e.g. in the case of credit scoring, price discrimination or identification of security risks (Hildebrandt & Gutwirth 2008; Elmer 2004).
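The distinction between constructing a group profile and applying it to an individual can be made concrete with a toy sketch. Everything in it is hypothetical: the records, the grouping attribute (an area label), the function names and the 0.5 threshold are invented for illustration and do not describe any actual credit scoring system.

```python
from collections import defaultdict

# Hypothetical historical records: (area, defaulted_on_loan)
history = [
    ("north", False), ("north", False), ("north", True), ("north", False),
    ("south", True), ("south", True), ("south", False), ("south", True),
]


def build_group_profiles(records):
    """Construction step: aggregate records into one default rate per group."""
    counts = defaultdict(lambda: [0, 0])  # group -> [defaults, total]
    for group, defaulted in records:
        counts[group][0] += int(defaulted)
        counts[group][1] += 1
    return {group: defaults / total
            for group, (defaults, total) in counts.items()}


def apply_profile(profiles, group, threshold=0.5):
    """Application step: label an individual using only the group's rate."""
    return "high risk" if profiles[group] >= threshold else "low risk"


profiles = build_group_profiles(history)
print(profiles)
print(apply_profile(profiles, "south"))
```

With these made-up records the group rates are 0.25 for "north" and 0.75 for "south", so any applicant from "south" is labelled "high risk", including one who has never defaulted. The label describes the group, not the person it is applied to.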
Profiling is a matter of computerized pattern recognition. Profiling practices enable refined price discrimination, targeted servicing, detection of fraud and extensive social sorting. Real-time machine profiling constitutes the precondition for emerging socio-technical infrastructures envisioned by advocates of ambient intelligence, autonomic computing (Kephart & Chess 2003) and ubiquitous computing (Weiser 1991).
Surveillance :
Surveillance (pronounced /sərˈveɪ.əns/ or /sərˈveɪləns/) is the monitoring of behavior, activities, or other changing information, usually of people and often in a surreptitious manner. It most often refers to observation of individuals or groups by government organizations, but the term is also used more broadly: disease surveillance, for example, is the monitoring of the progress of a disease in a community.
The word surveillance comes from the French word for “watching over”.
The word surveillance may be applied to observation from a distance by means of electronic equipment (such as CCTV cameras), or interception of electronically transmitted information (such as Internet traffic or phone calls). It may also refer to simple, relatively no- or low-technology methods such as human intelligence agents and postal interception.
Surveillance is useful to governments and law enforcement for maintaining social control, recognizing and monitoring threats, and preventing and investigating criminal activity. With the advent of programs such as the Total Information Awareness program and ADVISE, technologies such as high-speed surveillance computers and biometrics software, and laws such as the Communications Assistance for Law Enforcement Act, governments now possess an unprecedented ability to monitor the activities of their subjects.
However, many civil rights and privacy groups, such as the Electronic Frontier Foundation and the ACLU, have expressed concern that allowing continual increases in government surveillance of citizens will lead to a mass surveillance society with extremely limited or non-existent political and personal freedoms. Fears such as this have led to numerous lawsuits, such as Hepting v. AT&T.
“This article is brought to you by Gus Woltmann”.