With large, diverse
geospatial datasets and an integrated, user-centric interface for analyzing
user-defined algorithms, the Instrument will give researchers the opportunity
to experiment with spatially aware information and knowledge discovery
algorithms.
This
will support the mining of interesting geospatial patterns over spatial
objects for rare event
detection, spatio-temporal change and trend detection,
and correlation mining.
For
example, in rare event detection, unpredictable events are
extremely difficult to detect because they do not occur often or they occur at a
time or location where they are not expected (e.g., detecting traffic congestion
during a disaster situation). The
Instrument will provide researchers with historical data for establishing a
baseline for dynamic event behavior models, along with
scenarios in which deviations from the normal model
are identified. Algorithms can then be built
to improve prediction models, and these can
be input into the Instrument for testing and for future rare event detection and
prediction.
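The baseline-and-deviation idea described above can be illustrated with a minimal sketch. It assumes, purely for illustration, that historical data arrives as (location, hour, value) records and that a deviation is an observation whose z-score against the per-location, per-hour baseline exceeds a threshold. The function names, record layout, and threshold are hypothetical and are not part of the Instrument.

```python
from collections import defaultdict
from statistics import mean, pstdev


def build_baseline(history):
    """Summarize historical records as (mean, std) per (location, hour) cell.

    `history` is an iterable of (location, hour_of_day, value) tuples.
    This record layout is an assumption made for the sketch.
    """
    grouped = defaultdict(list)
    for location, hour, value in history:
        grouped[(location, hour)].append(value)
    return {key: (mean(vals), pstdev(vals)) for key, vals in grouped.items()}


def find_deviations(baseline, observations, z_threshold=3.0, min_std=1e-9):
    """Return observations whose z-score against the baseline is large."""
    flagged = []
    for location, hour, value in observations:
        stats = baseline.get((location, hour))
        if stats is None:
            continue  # no history for this cell, so no judgment is made
        mu, sigma = stats
        z = (value - mu) / max(sigma, min_std)  # guard against zero variance
        if abs(z) >= z_threshold:
            flagged.append((location, hour, value, z))
    return flagged


history = [("A", 8, 10), ("A", 8, 12), ("A", 8, 11), ("A", 8, 9)]
observations = [("A", 8, 30), ("A", 8, 11), ("B", 8, 100)]
flagged = find_deviations(build_baseline(history), observations)
```

In this toy run only the first observation is flagged; the reading at location "B" is skipped because no baseline exists for it. A real deployment would likely need a richer behavior model than a per-cell mean and standard deviation.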
The Instrument will also aid in the investigation
of bootstrapping techniques [CBO+02,
JDC87] and cost-sensitive learning approaches [Elk01, ZE01] for rare event
detection in spatial data with
semantic awareness.
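As a small illustration of why cost-sensitive learning matters for rare events, the sketch below applies the standard decision-threshold result from cost-sensitive learning theory [Elk01]: with calibrated probabilities, an example is labeled positive when its probability reaches a threshold determined by the misclassification costs. The function names and the example costs are hypothetical; the proposal does not prescribe a specific formulation.

```python
def optimal_threshold(cost_fp, cost_fn, cost_tp=0.0, cost_tn=0.0):
    """Probability threshold above which predicting 'event' minimizes expected cost.

    cost_fp: cost of raising an alert when no event occurred
    cost_fn: cost of missing a real event
    """
    gain_if_negative = cost_fp - cost_tn  # saved by a correct 'no event'
    gain_if_positive = cost_fn - cost_tp  # saved by a correct 'event'
    if gain_if_negative <= 0 or gain_if_positive <= 0:
        raise ValueError("errors must cost more than correct decisions")
    return gain_if_negative / (gain_if_negative + gain_if_positive)


def classify(probabilities, cost_fp, cost_fn):
    """Label each calibrated probability as event (True) or no event (False)."""
    threshold = optimal_threshold(cost_fp, cost_fn)
    return [p >= threshold for p in probabilities]


# Assumed costs for illustration: a missed event is nine times as costly
# as a false alarm, so the threshold drops from 0.5 to 0.1.
labels = classify([0.05, 0.2, 0.6], cost_fp=1.0, cost_fn=9.0)
```

Lowering the threshold this way trades additional false alarms for fewer missed events, which is often the desired behavior when the positive class is rare and costly to miss.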
The general event
detection problem has been intensively studied in traditional document
collections as Topic Detection and Tracking (TDT) [APL98]. However, unlike
traditional documents, Twitter messages are very short and noisy,
containing nonstandard terms such as abbreviations, acronyms, and emoticons
[Eis13, LWJ12], which makes harvesting Twitter data for event detection
quite challenging. The Instrument would
allow the linking of tweet messages with many static data sources and the
identification of the messages referring to the same real-world entity. As a
result, the integration can provide an
aggregated view of the disaster domain, in which attributes and their
values are assembled and fused from thousands of mainly unstructured
messages, and can improve the accuracy of event detection.
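The linking step described above can be sketched in a few lines. The example normalizes noisy message text with a tiny abbreviation lexicon and then groups messages by the static entity they mention. The lexicon, the gazetteer, and all function names are hypothetical toy stand-ins; a practical system would rely on a broad-coverage normalizer such as the one described in [LWJ12] and on real static data sources.

```python
import re

# Toy lexicon for illustration only; real abbreviations are ambiguous
# (for example, "st" can also mean "saint").
ABBREVIATIONS = {"rd": "road", "st": "street", "hwy": "highway"}


def normalize(text):
    """Lowercase, drop punctuation and emoticons, and expand abbreviations."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return [ABBREVIATIONS.get(token, token) for token in tokens]


def link_messages(messages, gazetteer):
    """Group messages by the gazetteer entity whose name they mention.

    `gazetteer` maps an entity id to a canonical name from a static source.
    """
    linked = {entity_id: [] for entity_id in gazetteer}
    for message in messages:
        # Pad with spaces so names only match on whole-token boundaries.
        padded_text = " " + " ".join(normalize(message)) + " "
        for entity_id, name in gazetteer.items():
            padded_name = " " + " ".join(normalize(name)) + " "
            if padded_name in padded_text:
                linked[entity_id].append(message)
    return linked


messages = [
    "Flooding on Main St!!",
    "main street closed :(",
    "Power out on Oak Rd",
]
gazetteer = {"e1": "Main Street", "e2": "Oak Road"}
linked = link_messages(messages, gazetteer)
```

Here the first two messages are grouped under entity "e1" even though they spell the street name differently, which is the kind of aggregation that makes it possible to fuse attribute values across many short messages.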
The
Oak Ridge National Laboratory [http://www.ornl.gov/] (see
letter) will utilize the Instrument to tie together sensor data
across the US to create a real-time detection and alert system. Their SensorNet
Project will utilize the Instrument's analytics and super-resolution data.
The team of Dr. S.S.
Iyengar [http://users.cis.fiu.edu/~iyengar/]
at FIU will utilize the Instrument to perform two case studies on trend
analysis and monitoring data behavior. The sensor data collected by the
Instrument can be used for optimizing risk assessment strategies as well as
for detecting critical events through information mining. The optimization
algorithm monitors the dynamic behavior of the data and identifies distinctive
features in the data set.
References Cited
[CBO+02] N. V.
Chawla, K. W. Bowyer, L. O. Hall, and W. P. Kegelmeyer. "SMOTE: Synthetic Minority Over-sampling
TEchnique." Journal of Artificial Intelligence
Research, 16:321-357, 2002.
[JDC87] A.
K. Jain, R. C. Dubes, and C.-C. Chen.
"Bootstrap techniques for error estimation." IEEE Transactions on Pattern
Analysis and Machine Intelligence, 9:628-633, 1987.
[Elk01] C. Elkan. "The foundations of cost-sensitive
learning." In IJCAI, pages 973-978, 2001.
[ZE01] B. Zadrozny and C. Elkan. "Learning and making decisions when costs and probabilities are
both unknown." In Proceedings of the Seventh ACM SIGKDD International
Conference on Knowledge Discovery and Data Mining, pages 204-213, 2001.
[APL98] J. Allan, R. Papka, and V. Lavrenko. "On-line new event detection and
tracking." In Proceedings of the 21st Annual International ACM SIGIR
Conference on Research and Development in Information Retrieval, pages 37-45. ACM, 1998.
[Eis13] J. Eisenstein. "What
to do about bad language on the internet." In Proceedings of the 2013 Conference
of the North American Chapter of the Association for Computational Linguistics:
Human Language Technologies (NAACL/HLT), 2013.
[LWJ12] F. Liu, F. Weng, and X. Jiang. "A broad-coverage normalization system for social media language."
In Proceedings of the 50th Annual Meeting of the Association for Computational
Linguistics (ACL), pages 1035-1044, 2012.