Colloquium 2000
Call For Participation

 
Internet Archive Colloquium 2000
March 8-9, 2000, The Presidio, San Francisco, CA
http://archive.org/content/events.html

SUBMISSION DEADLINE EXTENDED!!!

The Internet Archive is building a digital library to save important Internet-based information from the 20th and 21st centuries for the 22nd century and beyond. We are interested in the long-term preservation of the often transient, but important, digital information of the Internet for use by scientists, historians, journalists, and others. The first content collection is 20 Terabytes of Web pages (growing at 2TB/month) that will be hosted on disk for on-line access by researchers. The Internet Archive is also developing digital collections from other sources, including a moving images collection.

The Internet Archive is a resource for the world, and depends for its success on interaction, partnership, and cooperation with others. To this end, we are hosting our first colloquium, on March 8-9, 2000, in the Presidio of San Francisco. We invite you to participate in this exploration of our mission and help us plan for the future.

The colloquium will be a forum for discussion of the current status of the latest research and real world projects in the areas of digital libraries, building, understanding and using archives of Internet based information sources, and storage/indexing/access of large multi-media archives. Part of the workshop will be devoted to developing, as a group, a roadmap for the Internet Archive to follow in the next few years.

CONTRIBUTIONS AND PARTICIPATION: The Internet Archive Colloquium will emphasize participant contribution in the form of oral/poster presentation, group discussion, and close interaction. We encourage all participants to contribute their ideas, knowledge, and vision for the future. This 1.5 day conference is limited to 30-40 people. We are interested in exploring the following:

TOPICS:
  • Long Term Preservation: How can we best plan for perpetual archival of the Internet? Issues of physical storage, software and data standards, and migration plans are important.
  • Data Mining on the Internet: Treating the Internet as a massive but unique database, what questions are of interest, and which tools/methodologies are best applied to answer those questions or discover relationships?
  • Search and Browsing: Even with a stable and logical addressing scheme, what will be needed for effective indexing and search of massive archives? Issues of scalability and usability are important.
  • Analysis of the Internet: What interesting measures can be made on the Internet as a data set? Growth rate and distribution? Connectivity? What tools can be used to do this? (e.g. statistics, machine learning, psychological and sociological models, operations research)
  • Working with Streaming Media: Issues related to collecting, storing, annotating, indexing, browsing, use of meta-data, and retrieval interfaces for Internet based audio/video archives.
  • Archiving the dynamically created Web: How to identify and save the material underlying dynamically constructed Web pages. What are the questions about such material that the archive should be able to address?
  • Toolkits for Archive Access: Software, frameworks, tools, APIs, standards to allow easy access to the Internet Archive's resources. How should these be developed to meet users' needs, and how will they evolve to handle growth and changing user interests and demands?
  • Social/Historical Issues: How can the Internet Archive serve as a resource to explore social evolution in the era of digital communications? How can adding a historical component change the perspective of how the Internet is used and understood? How does this affect the concept of a search engine?
PROGRAM: The colloquium will be a mix of oral and poster presentations, a panel discussion, and a brainstorming session to plan the Internet Archive's roadmap for the future.
  • Presentations: We invite position/research/system "statements of interest" for oral/poster presentation or discussion. Submissions should emphasize novel and important ideas for the future as well as the outstanding challenges that must be met along the way. Statements should be related to one or more of the topics listed above, be 1 to 6 pages in length and may fall into one or more of the following categories:
    • Position: These include important issues that should be brought up for discussion. Submissions on standards and on open technical, organizational, and other problems to be met in archiving the Internet are encouraged.
    • Research: These should represent timely, highly relevant research results and/or topics related to where research should be headed.
    • System: These should discuss technical or organizational systems practices, tools and software, and other developments from which interesting and important lessons can be learned.
    • Other: Submissions that don't fit into one of the above categories will be judged on their own merits on a case by case basis.
    All submissions should be in English, and you should specify your preferred presentation medium. However, be aware that not all requests may be met. Statements should be submitted electronically in PDF (preferred) or PostScript form by e-mailing them to submissions@archive.org by 7 December 1999. (NOW EXTENDED TO DECEMBER 21 1999)
  • Panel discussion: Information brings power, and so the creators of digital archives may have a significant influence on the future. The colloquium may include a panel discussion, tentatively on the topic of how we can responsibly build a large digital archive for the future. Specific issues may include: Given limited technological resources, how does one decide what information to save, and what is lost? How should issues of privacy, copyright, and estimated future importance affect the archival process? Should all archived information be made public? Should the public be made part of the decision process?
  • Roadmap: The Internet Archive values and needs your ideas on how to build its digital library for the future. We will have a brainstorming session to discuss:
    1. What information is worth gathering?
    2. How should the information be stored and indexed?
    3. In what form should the archives appear and be accessed?
    4. What questions should be answerable by using the stored archives?
    5. Who should our users be and how should we support them?
    6. What technologies and standards need to be developed for the archiving/preservation/retrieval of digital documents?

SCHEDULE:
21 December 1999 - Submission of interest to colloquium2000@archive.org
15 January 2000 - Final participant list announced
8-9 March 2000 - Internet Archive Colloquium 2000

LOGISTICS: The Internet Archive Colloquium will be held March 8-9 at the Golden Gate Club in the Presidio of San Francisco. Lunch on both days and dinner on March 8 will be provided.

 More information: Hotels

FEES: There is no fee for attending, and no honorariums or reimbursements will be offered, with few, if any, exceptions. We very much appreciate the time, money, and effort it will take to attend, and will do everything we can to make it worthwhile all around.

PROGRAM COMMITTEE:
Kurt Bollacker, Internet Archive (coordinator kurt@archive.org)
Gary Flake, NEC Research Institute (flake@research.nj.nec.com)
Lee Giles, NEC Research Institute (giles@research.nj.nec.com)
Michael Jordan, UC Berkeley (jordan@cs.Berkeley.edu)
Brewster Kahle, Alexa Internet (brewster@alexa.com)
Scott Kirkpatrick, IBM Watson (KIRK@watson.ibm.com)
Steve Lawrence, NEC Research Institute (lawrence@research.nj.nec.com)
Michael Lesk, NSF (lesk@acm.org)
Peter Lyman, UC Berkeley (SIMS) (plyman@sims.berkeley.edu)
Jim Pitkow, Xerox PARC (pitkow@parc.xerox.com)
Pam Samuelson, UC Berkeley (SIMS) (pam@sims.berkeley.edu)
Yoram Singer, Hebrew University of Israel (singer@cs.huji.ac.il)

If you have any questions, please feel free to contact Kurt Bollacker.

Kurt Bollacker
Technical Director, Internet Archive
kurt@archive.org