News and Notes
- Class on Thursday (2/11/99) is cancelled. To compensate, each student is expected to read the following paper, which will be discussed the following week.
- J. Han, "Data Mining", in J. Urban and P. Dasgupta (eds.), Encyclopedia of Distributed Computing, Kluwer Academic Publishers, 1999.
[Note: This paper is available as a PostScript (.ps) file,
which can be viewed with a PostScript viewer such as GSview.]
- As mentioned in class, your Oracle accounts can be accessed using the machine. Your Oracle id is the same as your gl id, and the initial
password assignment is as described in class.
Please include the following command in your .cshrc:
source /opt/bin/oracle_cshrc
You can then connect to the database using sqlplus. If you are
not familiar with SQL, you can change your password using the command
alter user id identified by newpasswd;
- Paper discussed in class on 2/18:
M.S. Chen, J. Han, and P.S. Yu, ``Data Mining: An Overview from a
Database Perspective'', IEEE Transactions on Knowledge and Data
Engineering, 8(6): 866-883, 1996.
- The project has been assigned.
- Topics covered in class on 2/25/99:
- Building Classification Models: ID3 and C4.5
- Machine Learning Tutorial by David W. Aha, Naval Research Laboratory.
- On 4/01 and 4/06, we will be discussing the seminal paper on Association
Rules by Agrawal et al.
- Prof. Ramakrishnan at Virginia Tech has some interesting slides
that illustrate association rules.
- Paper presented in class on 4/06: BIRCH: A New
Data Clustering Algorithm and Its Applications, T. Zhang,
R. Ramakrishnan, M. Livny, DMKD Journal, vol. 1, no. 2.
- Papers presented in class on 4/08:
- SPRINT: A Scalable Parallel Classifier for Data Mining, Shafer, Agrawal
and Mehta, Proc. 22nd VLDB Conference.
- NERF c-Means: Non-Euclidean Relational Fuzzy Clustering, R. J. Hathaway and J. C. Bezdek, Pattern Recognition, vol. 27, no. 3, pp. 429-437, 1994.
- On 4/13 and 4/15, we will discuss web mining. Initial readings can be found on
my web page.
- Papers presented in class on 4/20:
- R. Ng, L. V. S. Lakshmanan, J. Han and T. Mah,
``Exploratory Mining via Constrained Frequent Set Queries'',
Proc. 1999 ACM-SIGMOD Conf. on Management of Data (SIGMOD'99) (demo),
Philadelphia, PA, June 1999.
- Heikki Mannila and Hannu Toivonen:
Multiple uses of frequent sets and condensed representations.
In Second International Conference on Knowledge Discovery and
Data Mining, Portland, Oregon, August 2-4, 1996.
- Papers presented in class on 4/22:
- Sampling large databases for association rules
by Hannu Toivonen.
In 22nd International Conference on Very Large Databases (VLDB'96),
134 - 145, Mumbai, India, September 1996. Morgan Kaufmann.
- A perspective on databases and data mining
by Marcel Holsheimer, Martin Kersten, Heikki Mannila, and Hannu Toivonen.
In First International Conference on Knowledge Discovery
and Data Mining (KDD'95),
150 - 155, Montreal, Canada, August 1995. AAAI Press.
- The exam schedule has been pushed back by 4
days from the date originally announced in class. The exam was handed
out on 4/20 and is due 5/4.
- Papers presented in class on 4/27:
- D. Tsur et al., Query Flocks: A
Generalization of Association Rule Mining,
Proceedings of the 1998 ACM SIGMOD Conference on
Management of Data, 1998.
- S. Brin, R. Motwani, C. Silverstein, Beyond Market
Baskets: Generalizing Association Rules to
Correlations, 1997 ACM SIGMOD Conference on
Management of Data, 1997, pp. 265-276.
- Papers presented in class on 4/29:
- O. R. Zaiane, M. Xin, J. Han,
``Discovering Web Access Patterns and Trends by Applying OLAP and
Data Mining Technology on Web Logs'', Proc. Advances in Digital
Libraries Conf. (ADL'98), Santa Barbara, CA, April 1998, pp. 19-29.
- M. Perkowitz and O. Etzioni, Towards Adaptive Web Sites: Conceptual Framework and Case Study, in Proceedings of WWW8, 1999.
- A FAQ about the exam has been added.
- Class on 5/4 will feature a guest lecture on text mining by Prof. Charles
- Paper covered in class on 5/6:
S. Santini, R. Jain, Similarity Matching, IEEE Transactions on Pattern Analysis and Machine Intelligence, 1999 (in press).
- Papers presented in class on 5/11:
- Helena Ahonen, Oskari Heinonen, Mika Klemettinen, and
Inkeri Verkamo, Applying Data Mining
Techniques in Text Analysis, Report C-1997-23, University of Helsinki, Department of Computer Science, March 1997.
- O. Zamir and O. Etzioni, Web Document Clustering: A feasibility demonstration, Proc. SIGIR 1998.
- Papers presented in class on 5/13:
- K. Koperski, J. Han, and J. Adhikary,
``Mining Knowledge in Geographical Data'', to appear in Communications
of ACM, 1998.
- S. Santini, R. Jain, Beyond Query by Example,
Proceedings of the Sixth ACM International Multimedia Conference, ACM Multimedia '98, Bristol,
England, September 1998.
- H. Lu, J. Han, and L. Feng,
``Stock Movement and N-Dimensional Inter-Transaction Association Rules'',
Proc. of 1998 SIGMOD Workshop on Research Issues on Data Mining and
Knowledge Discovery (DMKD'98), Seattle, Washington, June 1998.
- A signup sheet has now been posted on my door for final project
demonstrations and turning in reports. If you cannot come in person
to sign up, please send me an email, and I will sign you up if that
slot is free. The period covered is the 10th through the 20th.