Announcements
The homepage is always under construction. Check the course description and syllabus below to decide if
this course suits you.
The paper presentation on April 12th has changed.
Instructor
Dr. Tao Li, Assistant Professor
School of Computer Science and Engineering
Florida International University
Office: ECS 318
Email: taoli AT cs.fiu.edu
Office Hours: Tuesday and Thursday 5:00pm-6:00pm or by appointment
Meeting Time and Location
Thursday 12:30pm-3:15pm, ECS 136
Course Description
Data Mining is one of the hottest fields in Computer
Science. Data has been accumulating throughout the computer age in many forms,
including database systems, spreadsheets, text files, and recently web pages.
Data mining aims to uncover hidden relationships and patterns in this
data. This is a special topics course on data mining. We will cover
advanced topics such as web data mining, stream data mining, relational data
mining, tree/graph mining, spatiotemporal data indexing and mining,
privacy-preserving data mining, high-dimensional data clustering, basics of
natural language processing, and social network and linkage analysis.
This course will be highly beneficial to students whose
research interests are in databases, data mining, bioinformatics, information
retrieval, decision science, and artificial intelligence, and also to those who
need to apply data mining in their own applications.
Course Syllabus (Subject to revision)
This is a seminar course that will focus on recent developments in advanced
data mining techniques and their applications to various problems. After the
introductory lectures, subsequent classes will be based mainly on research
papers. Topics will cover:
- Overview of Basic Data Mining Techniques
- Mining Data Streams
- Relational Data Mining
- Tree/Graph Mining
- Spatiotemporal Data Indexing and Mining
- Privacy-preserving Data Mining
- Similarity Search
- High-Dimensional Data Clustering
- Social Network and Linkage Analysis
- Basics of Natural Language Processing
Prerequisites
COP5992 Principles of Data Mining or Consent of Instructor
Format and Grading
- The final grade will be based on the student's presentation and
participation (40%) and on assignments and the project (60%). Students who
demonstrate excellent research performance by developing the project to
the point of publication will receive extra credit.
- Everyone needs to present at least one paper (with
high-quality PPT/PDF slides). The presenter will also be
responsible for leading group discussions and answering questions. In addition, everyone needs to bring
a one-page summary of, and comments on, the papers to be presented in class and hand it in
right after the class presentations.
- Everyone will conduct a research project during the course. The project
can be a comparative study of existing data mining algorithms for a
specific application, the development of new data mining algorithms that
improve on existing methods to some extent, or a novel application of existing
methods to practical problems. The project can be done individually
or in a group of two.
- You are strongly encouraged to select high-quality papers
published in 2005, 2006, or 2007. Please discuss your choice with me before you
finalize your paper selection.
- Recommended conference proceedings: SIGKDD, SIGIR, ICML, SIGMOD, ICDM, SDM, etc. Recommended journals: DMKD (Data Mining and Knowledge
Discovery), SIGKDD Explorations, Machine Learning,
Journal of Machine Learning Research, Knowledge and Information Systems (KAIS), IEEE TKDE,
etc. Use Google Scholar, CiteSeer, or other Web services to find the papers you
want to select.
Textbooks and References
The course materials will mainly consist of presentation
and discussion of research papers and research project reports closely related
to the topics in data mining. A lot of reading material from top
conferences/journals will be made available online or in class as required. In
addition, lecture notes will be available online.
The following textbooks are highly recommended (you
should have at least one of these books):
- Ian H. Witten and Eibe Frank. Data Mining: Practical Machine Learning Tools and Techniques with Java Implementations.
Morgan Kaufmann Publishers, 2005.
- Jiawei Han and Micheline Kamber.
Data Mining: Concepts and Techniques. Morgan Kaufmann Publishers,
2006.
- Pang-Ning Tan, Michael Steinbach and Vipin Kumar. Introduction to Data
Mining. Addison Wesley, 2005.
List of References:
- Tom Mitchell. Machine Learning.
McGraw Hill, 1997.
- R. O. Duda et al. Pattern Classification.
Wiley Interscience.
- Hastie, Tibshirani and Friedman.
The Elements of Statistical Learning.
Springer-Verlag, 2001.
- Chakrabarti.
Mining the Web: Discovering Knowledge from Hypertext Data.
Morgan Kaufmann, 2003. Available online at
FIU Library.
Course Materials
Related Links
Code of Academic Integrity:
University Policies:
For academic misconduct, sexual harassment, religious holy days, and information on services for students with disabilities, see:
©2007 Tao Li. All rights reserved.