The ISRL/Prairienet systems staff spent the second week of January, 2003, in a
project-planning retreat. The goal of the retreat was to more clearly
define the numerous major projects being considered as an area of focus
for spring/summer 2003. Descriptions, impacts, and costs for 14 projects
were covered during the week.
Critical dependencies were also assessed and documented. Along with
these projects, various ongoing and one-time tasks were noted.
This information has been reviewed, and the following goals and timelines
have been set for the spring and summer, 2003. (Timelines are based on
Brynnen and Paul each working 15 hours/week on these projects. Their
remaining time will be split between ongoing and one-time tasks.)
These projects represent a
major effort to complete infrastructure development begun with the move of
ISRL facilities from the TIS building in spring 2001. Indeed, early critical
work on the infrastructure began in 2000 with Prairienet; these efforts will
also provide a unified administrative platform for these systems. Overall,
the efforts will facilitate disaster recovery, unify and simplify management
and administration of systems, and provide the foundational pieces needed to
more readily add new services.
A primary focus in developing a core technology infrastructure is to come up with a design that can not only meet today's needs but also be a foundation on which technology for tomorrow can be based. Some of the key components of GSLIS's current infrastructure, such as user authentication and authorization, are based on a framework developed 15 years ago. We are working to develop a framework that will hopefully serve GSLIS another 15 years.
But to accomplish this, key challenges need to be addressed. The complexity of systems and networks has grown at an exciting or alarming rate, depending on your perspective. The possibilities for ubiquitous computing and, more importantly, for the resulting changes in the way humans interact provide fertile ground for research and development. A framework needs to be developed that allows GSLIS to take a lead in portions of this work.
However, while the future is evolving, the day-to-day tasks still need to be accomplished. Supporting day-to-day activities in a stable manner while also providing the core infrastructure for emerging technologies requires a much broader foundation than was required in the past. The kind of core infrastructure needed is almost anyone's guess when an individual might carry multiple network-ready devices, when the network itself changes constantly as we move from room to room, and when even the concept of an operating system is challenged as the Internet assumes a broader role in resource management.
Certain key components in such a core technology infrastructure do emerge, however, and are our current focus as we build a framework for future work. The development of the design has included hundreds of hours of research into possible off-the-shelf alternatives, discussions regarding what outsourcing options are available, discussions with peers at conventions across the country, and of course numerous meetings with personnel within GSLIS. Key philosophies inherent within the design include:
- A move away from a small number of large scale servers to multiple
instances of a common platform to provide redundancy of systems designed to
meet targeted needs, limit the effects of performance degradation by one
service or user on other services and users, and provide high degrees of
scalability. By using a common platform, the cost of maintaining
additional systems is minimal once the initial cost for the overall core is
considered. However, this method is limited inasmuch as the non-technical
infrastructure (e.g., electrical, cooling) can only support a certain
number of systems. Future growth may require a move to new technologies,
such as blade technologies, that allow higher densities of computers in
limited spaces. The move to a common platform trades flexibility
in who can have elevated privileges on the system for lower long-term
maintenance costs. Eventually, a goal is to be able to quickly
create new systems that meet targeted needs and authenticate through the
core infrastructure while being maintained by end users.
- Implementation of a common authentication/authorization mechanism that
is based on an open standards protocol, LDAP. This allows multiple
operating system platforms and Internet services to interact using a
single user information database. Further, this database can store a
myriad of data fields that allow it to be authoritative not only for
systems access, but even web tree access, for instance.
- Mechanisms for managing the ever-increasing disk space required by
users. This increased demand for storage necessitates not only more
hard drives and more systems, but also mechanisms for managing
them. Tracking usage and performance, balancing load, providing
appropriate levels of tested backups and archives with strategies
for recovery, and a number of other management issues create ever greater
levels of complexity as amounts of storage grow.
Over the last several years, considerable effort has gone into developing
these key components. Work has often been slowed as we brought together
systems based on differing paradigms while these systems continued to be used on a daily basis. Nevertheless, many of these components have now been field tested on systems supporting both Prairienet and ISRL users. However, because in each case components were of necessity implemented even as the overall design continued to be developed, there are critical differences between the implementations. Further, over the years, the design has matured in significant ways that are not yet implemented in any single instance. Our goal over the next six months is to create a unified platform using current design concepts. This platform would not only serve the needs of Prairienet and ISRL, but also provide a basis for a broader integration of systems within a unified platform across GSLIS.
Information within these lists is updated bi-weekly. The last update of
current time spent and project status was made July 28, 2003.
- Project Title:
CANIS Migration
- Project Abstract:
-
Users on Griffin are using minimally supported, aging equipment based
on a project-specific (CANIS) administration model. The desired outcome
is the migration of active accounts and projects to common infrastructure.
- Projected Start Date:
-
Fall 2002
- Projected Completion Date:
-
Original: March 14, 2003; Revised: March 31, 2003
- Major Steps:
-
| Step: | Estimated Time: | Total Time Spent: | Status: |
| Account Creation | 1 hour | ?? | completed |
| Moving user home directories | 1 hour | | completed |
| Moving user email | 1 hour per user | | completed |
| Moving CANIS CNAME | 0.5 hour | 0.5 hour | completed |
| Removing systems from server room | 2 hours | 0.45 hours | completed |
| Summary: | 7.5 hours | 8.45 hours | 100% Complete |
- Notes:
-
Awaiting users' completion of migration of their data. Will shut down
final systems 2/28/03 and will de-rack shortly after that.
Users should have migrated off. However, one Windows system still
authenticates to the NIS server on Griffin. Because the NIS auth
software on Windows is a long-forgotten hack, only a reimage of the
Windows workstation will fix this. Thus, retirement of Griffin is
now dependent on the Windows Integration project.
Windows workstation has been reinstalled. Griffin has been powered
off and is awaiting de-racking, which will likely occur by end of March.
- Project Title:
CVS Implementation for Systems Staff
- Project Abstract:
-
Develop an implementation plan for the Concurrent Versions System (CVS)
to allow systems staff to share working sets of system files
(including source files and documentation), to roll back changes, and
to merge changes.
- Projected Start Date:
- January 20, 2003
- Projected Completion Date:
- January 24, 2003
- Major Steps (estimated time/time committed):
-
| Step: | Estimated Time: | Total Time Spent: | Status: |
| CVS Installation/Update | Unscheduled | 2 hours | 100% Complete |
| Time to develop the guide | 4 hours | 7 hours | 100% Complete |
| Time for acclimation | 1 hour/admin (3 admins) | 1.5 hours | 100% Complete |
| Summary: | 7 hours | 10.5 hours | 100% Complete |
- Notes:
-
Martin needs to familiarize himself with CVS.
- Project Title:
Disaster Recovery
- Project Abstract:
-
We need to have a full plan of attack in the case of catastrophic
failure of any central server, including improved reliability and
performance of backups, off-site storage of backups, prioritization
policy for order of system recovery, recovery procedure checklist,
and periodic simulated system recoveries.
While backups are currently happening, the new backup scheme will help
in the following ways:
- Move to a more efficient program (rsync) to do backups.
Currently the backup of ISRL's home shares takes 12 hours using
the tar program;
- Currently backups of a given filesystem are always made to the
same destination filesystem. If the source and destination both
fail, the data is lost. The new system will rotate destination
filesystems to increase the number of drives that must fail before
data is lost;
- Currently the number of backups of a given filesystem is statically
defined. The new system will allow as many backups as the destination
drive space allows;
- The new system will allow easier monitoring of status.
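The rotation and space-driven retention described in the list above could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the staff's actual backup code: the function names (`pick_destination`, `prune_to_fit`), the mount points, and the size units are hypothetical, and the copy step itself (rsync) is left out.

```python
from typing import Dict, List


def pick_destination(destinations: List[str], run_number: int) -> str:
    """Rotate through destination filesystems so that consecutive
    backups of one source land on different drives."""
    return destinations[run_number % len(destinations)]


def prune_to_fit(backups: Dict[str, int], capacity: int, incoming: int) -> List[str]:
    """Return the names of the oldest backups to delete so that a new
    backup of size `incoming` fits within `capacity`.

    `backups` maps backup name -> size and is ordered oldest first.
    Nothing is deleted while the new backup already fits, so the number
    of retained backups is bounded only by the available space.
    """
    used = sum(backups.values())
    to_delete = []
    for name, size in backups.items():
        if used + incoming <= capacity:
            break
        to_delete.append(name)
        used -= size
    return to_delete


# Hypothetical mount points, for illustration only.
targets = ["/backup/a", "/backup/b", "/backup/c"]
print(pick_destination(targets, run_number=4))
print(prune_to_fit({"monday": 40, "tuesday": 40}, capacity=100, incoming=30))
```

Keeping the retention decision separate from the copy step also makes the status easier to monitor, since the planned deletions can be logged before anything is removed.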
- Projected Start Date:
- January 27, 2003
- Projected Completion Date:
- Original: February 28, 2003; Revised: August 22, 2003
- Major Steps:
-
| Step: | Estimated Time: | Total Time Spent: | Status: |
| Documentation of Existing Backup Strategy | Unscheduled | 4 hours | 100% Complete |
| Stopgap Measures | Unscheduled | 1 hour | 100% Complete |
| Research Commercial & Outsourcing Alternatives | Unscheduled | 2.5 hours | 100% Complete |
| Policy planning | 6 hours | 3 hours | 75% |
| Project planning | 6 hours | 5 hours | 100% Complete |
| Software Design | 6 hours | 29 hours | 80% Complete |
| Software Coding | 8 hours | 92.5 hours | Proof of Concept Coding Completed; Production Code 67% Complete |
| Software Testing | 2 hours | | |
| Recovery procedure development | 24 hours | 0 | 100% Complete (from Design) |
| Recovery procedure testing | 2 hours | | |
| Documentation | 4 hours | | |
| Recovery checklist | 8 hours | | |
| Summary: | 66 hours | 106.8 hours | 82% Complete |
- Notes:
-
Project scope has changed significantly after further project
planning. Backups and disaster recovery will now much more
easily facilitate archiving of data down the road.
Code written by Paul needs further work, which will require a programmer.
Considering options to revise plan.
- Project Title:
NTP diagnosis and update
- Project Abstract:
-
NTP is a protocol used to automatically set system clocks from a
standard time source. While the initial setting of the clock seems to work,
there is an unexplained drift that occurs over time.
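One way to quantify such drift is to record the clock offset at two points in time and convert the change into a rate. The sketch below uses the standard NTP offset calculation from the four timestamps of a request/response exchange; the function names and the sample numbers are illustrative assumptions and do not come from the diagnosis itself.

```python
def clock_offset(t1: float, t2: float, t3: float, t4: float) -> float:
    """Offset of the server clock relative to the client clock.

    t1: client send time      (client clock)
    t2: server receive time   (server clock)
    t3: server send time      (server clock)
    t4: client receive time   (client clock)
    A positive result means the client clock is behind the server.
    """
    return ((t2 - t1) + (t3 - t4)) / 2.0


def drift_ppm(offset_start: float, offset_end: float, elapsed_seconds: float) -> float:
    """Average drift rate, in parts per million, between two offset samples."""
    return (offset_end - offset_start) / elapsed_seconds * 1_000_000


# Illustrative numbers: the offset grows by 86.4 ms over one day.
print(drift_ppm(0.0, 0.0864, 86_400))
```

A steady nonzero rate points to an uncorrected oscillator frequency error, whereas sudden jumps in the offset suggest that something else is stepping the clock.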
- Projected Start Date:
- February 10, 2003
- Projected Completion Date:
- Unix: February 14, 2003 (although long term testing required)
Windows: Unknown
- Major Steps:
-
For Unix/Linux Systems:
| Step: | Estimated Time: | Total Time Spent: | Status: |
| Diagnosis | 1 hour | 0.3 | 100% Complete |
| Testing | 1 hour | Seconds a day to check | 100% |
| Pushing Solution to all systems | 1 hour | 1.6 | 100% Complete |
| Summary: | 3 hours | 2.2 hours | 100% Complete |
- Notes:
-
Appears to be working. Will continue to check several times a day over
the next couple of months to ensure it is still working.
Confident at this point the fix is done.
For Windows Systems:
| Step: | Estimated Time: | Total Time Spent: | Status: |
| Initial Installation | 2 hours | 0 | |
| Testing | 1 hour | | |
| Pushing Solution to all systems | 3 hours | | |
| Summary: | 6 hours | 0 hours | 100% Complete |
- Notes:
-
Windows implementation has been passed off to Neil as a secondary
project on his todo list.
- Project Title:
LDAP Security
- Project Abstract:
-
Move ISRL and Prairienet LDAP implementations from a first stage
implementation to stage two of the implementation. This move entails
enhancement of security (e.g., restriction of unauthenticated requests;
explicit expiration of inactive accounts) and reliability (e.g.,
hot failover in the event of system crashes).
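The two mechanisms named above, failover between directory servers and expiration of inactive accounts, can be illustrated with a small sketch. It is an assumption-laden simplification: server-side replication is not modeled, the `is_up` probe is a placeholder for a real connection test, and the host names, account names, and 180-day threshold are hypothetical.

```python
from datetime import date, timedelta
from typing import Callable, Dict, List, Optional


def first_available(servers: List[str], is_up: Callable[[str], bool]) -> Optional[str]:
    """Return the first server in preference order that answers the probe,
    or None when every server is down."""
    for server in servers:
        if is_up(server):
            return server
    return None


def inactive_accounts(last_login: Dict[str, date], today: date, max_idle_days: int) -> List[str]:
    """List accounts whose last login is older than the idle limit."""
    cutoff = today - timedelta(days=max_idle_days)
    return sorted(user for user, seen in last_login.items() if seen < cutoff)


# Hypothetical data: the primary is down, so the replica is chosen.
print(first_available(["ldap1", "ldap2"], lambda host: host == "ldap2"))
print(inactive_accounts(
    {"alice": date(2002, 1, 1), "bob": date(2003, 2, 1)},
    today=date(2003, 3, 1),
    max_idle_days=180,
))
```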
- Projected Start Date:
- February 17, 2003
- Projected Completion Date:
- Hot failover: March 7, 2003
Access Control Lists: Original: February 28, 2003; Revised: March 13, 2003
ISRL Account Expiration: Fall/Winter 2003
- Major Steps:
-
Hot failover
| Step: | Estimated Time: | Total Time Spent: | Status: |
| Investigate SSL issues | 8 hours | 1.5 hours | 100% Complete |
| Investigate Email issues | Unscheduled | 4.25 hours | 100% Complete |
| Test configurations | 1.5 hours | 1.7 hours | 100% Complete |
| Generate and push production config | 4.5 hours | 6 hours | 100% Complete |
| Monitoring System integration | 6 hours | 0.2 hours | 100% Complete |
| Summary: | 20 hours | 13.5 hours | 100% Complete |
- Notes:
-
Access Control Lists (ACLs)
| Step: | Estimated Time: | Total Time Spent: | Status: |
| Research and Development/Testing | 8 hours | 6.25 hours | 100% Complete |
| Enable client access | 4 hours | 4 hours | 100% Complete |
| Patch current DSG/scripts | 6 hours | 4.2 hours | 100% Complete |
| Revoke anonymous access | 1.0 hour | 3 hours | 100% Complete |
| Testing | 1.5 hours | 1.5 | 100% Complete |
| Summary: | 20.5 hours | 12.75 hours | 100% Complete |
- Notes:
-
ISRL Account Expiration
| Step: | Estimated Time: | Total Time Spent: | Status: |
| Project Planning | 4 hours | 1 hour | 25% Complete |
| Policy generation | 4 hours + Peer Review and Revision | | |
| System Design | 8 hours | | |
| LDAP scripts | 4-8 hours | 3 hours | 50% Complete |
| * Archiving scripts | 0 hours or 10 hours | | |
| * Web configuration | 0 hours or 2 hours | | |
| Summary: | 20 to 36 hours | 4 hours | 20% Complete |
- Notes:
-
* If implemented, these tasks will have a cost per expired account
(from 1-4 hours admin time?).
- Project Title:
LDAP Web Authentication
- Project Abstract:
-
Develop a standard authentication gateway for system and other
web services using the LDAP database. This reduces development time
for system interfaces and expands
the scope of services that can be offered. It enables end users to
quickly deploy secure authenticated web pages.
Further, it eliminates the need to re-authenticate for each web transaction.
Combined with the Directory Server Gateway project, it also allows
smoother delegation of authentication/authorization database account
maintenance responsibilities.
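One common way to avoid re-authenticating on every web transaction is to issue a signed, time-limited token after the first successful LDAP check and to verify only the signature on later requests. The sketch below illustrates that general technique; it is an assumption about how such a gateway might work, not a description of the gateway that was built, and the secret, token layout, and function names are hypothetical.

```python
import hashlib
import hmac
from typing import Optional

# Hypothetical server-side secret; a real deployment would load a random key.
SECRET = b"replace-with-a-random-server-side-secret"


def issue_token(username: str, expires_at: int, secret: bytes = SECRET) -> str:
    """Create a token after the user has authenticated once (e.g., against LDAP)."""
    payload = f"{username}|{expires_at}"
    signature = hmac.new(secret, payload.encode(), hashlib.sha256).hexdigest()
    return f"{payload}|{signature}"


def verify_token(token: str, now: int, secret: bytes = SECRET) -> Optional[str]:
    """Return the username if the token is authentic and unexpired, else None."""
    try:
        username, expires_text, signature = token.rsplit("|", 2)
        expires_at = int(expires_text)
    except ValueError:
        return None  # malformed token
    payload = f"{username}|{expires_at}"
    expected = hmac.new(secret, payload.encode(), hashlib.sha256).hexdigest()
    # Constant-time comparison avoids leaking how many characters matched.
    if not hmac.compare_digest(signature.encode(), expected.encode()):
        return None
    return username if now < expires_at else None


token = issue_token("alice", expires_at=2000)
print(verify_token(token, now=1000))  # still valid
print(verify_token(token, now=3000))  # expired
```

Because the token carries its own expiry and signature, each web service can validate it without another query to the directory.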
- Projected Start Date:
- March 3, 2003
- Projected Completion Date:
- Original: March 21, 2003; Revised: March 28, 2003
- Major Steps:
-
| Step: | Estimated Time: | Total Time Spent: | Status: |
| Project Planning | 1 hour | 3.3 | 100% Complete |
| Software Research & Design | 12 hours | 10.4 | 100% Complete |
| Software Coding | 8 hours | 4 hours | 100% Complete |
| Software Testing | 1 hour | 2 | 100% Complete |
| Documentation and examples | 4 hours | 2 hours | 80% Complete |
| Summary: | 32 hours | 21.7 hours | 96% Complete |
- Notes:
-
- Project Title:
Updates to Email System
- Project Abstract:
-
The current email system is in phase two of three planned phases.
The current project would move the email system to phase three. This
entails additional user flexibility:
- enabling integrated quotas
- email tagging and filtering
- user level interface to delivery tools (such as vacation and
procmail)
- interface to mailing list/email archives
Migration of users to phase three entails:
- Conversion of mbox to maildir formats
- Integration of existing forward/procmail configuration files
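The mbox-to-maildir conversion step can be illustrated with Python's standard `mailbox` module. This is only a sketch of the idea, not the migration tooling planned for the project: the function name and paths are hypothetical, and locking of live mailboxes and handling of forward/procmail files are deliberately left out.

```python
import mailbox


def mbox_to_maildir(mbox_path: str, maildir_path: str) -> int:
    """Copy every message from an mbox file into a maildir directory.

    Returns the number of messages copied. The source mbox is left
    untouched, so the conversion can be checked before the old
    mailbox is retired.
    """
    source = mailbox.mbox(mbox_path, create=False)
    dest = mailbox.Maildir(maildir_path, create=True)
    count = 0
    try:
        for message in source:
            # MaildirMessage carries over mbox status flags where it can.
            dest.add(mailbox.MaildirMessage(message))
            count += 1
    finally:
        source.close()
        dest.close()
    return count
```

A per-user run might look like `mbox_to_maildir("/var/mail/someuser", "/home/someuser/Maildir")`, where both paths are placeholders for whatever layout the mail server actually uses.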
- Projected Start Date:
- March 24, 2003
- Projected Completion Date:
- Original: May 9, 2003; Revised:
- Major Steps:
-
| Step: | Estimated Time: | Total Time Spent: | Status: |
| Project Planning and enumeration | 12 hours | 7 hours | 75% Complete |
| Frontend Design | 12 hours | | |
| Frontend Coding | 8 hours | | |
| Frontend Testing | 4 hours | | |
| Package testing and decision | 30 hours | 8 hours for Mailman | Mailman: 95% Complete |
| Implement packages (e.g., procmail) | 3 hours/package | 3 hours for Mailman | Mailman: 90% Complete |
| Bootstrapping scripts | 48 hours | | |
| Documentation | 8 hours | | |
| Summary: | 125 hours | 18 hours | 20% Complete |
- Notes:
-
Project is currently on hold during the design phase of the LDAP Directory
Server Gateway project, upon which much of the remaining implementation of
the email upgrade is dependent.
Project design will also vary depending on the level of integration with
the GSLIS mail server and the resulting feature sets needed.
- Project Title:
LDAP Directory Server Gateway
- Project Abstract:
-
Create a set of web pages that allow authorized personnel a simplified
front end to LDAP for account and group management.
- Projected Start Date:
- April 1, 2003
- Projected Completion Date:
- July 25, 2003
- Major Steps:
-
| Step: | Estimated Time: | Total Time Spent: | Status: |
| Planning | 13 hours | 20 hours | 40% Complete |
| Feature review and catalog | 40 hours | | |
| Needs assessment | 24 hours | | |
| Software Design | 112 hours | | |
| Software Coding | 112 hours | | |
| Software Testing | 9 hours | | |
| User Documentation | 5 hours | | |
| Summary: | 213 hours | 20 hours | 10% Complete |
- Notes:
-
Increased planning has focused on combining resources and objectives
with other GSLIS groups and also with CITES as a means of reducing the
overall costs for development.
- Project Title:
Security and Reliability
- Project Abstract:
-
Develop tools for monitoring and tracking systems resources in a
quantifiable manner. Resources to be tracked/monitored include:
- UPS status
- systems status
- server load
- log sanity
- uptimes
- services accessibility (web, ldap, mail, printing, modems, ...)
- network bandwidth
- disk space
Reporting and control structures include status pages (e.g., network,
systems) and quota mechanisms.
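A few of the resources listed above (disk space, service accessibility) can be checked with the Python standard library alone. The sketch below is a minimal illustration under stated assumptions rather than the monitoring tool the project will produce: the function names, thresholds, and resource labels are hypothetical.

```python
import shutil
import socket
from typing import Dict, List


def disk_percent_used(path: str) -> float:
    """Percentage of the filesystem containing `path` that is in use."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / usage.total


def service_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    """True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def over_limit(readings: Dict[str, float], limits: Dict[str, float]) -> List[str]:
    """Names of monitored resources whose reading exceeds its limit."""
    return sorted(
        name for name, value in readings.items()
        if name in limits and value > limits[name]
    )


# Hypothetical limits: warn at 90% disk use or a load average above 4.
readings = {"disk": disk_percent_used("."), "load": 0.5}
print(over_limit(readings, {"disk": 90.0, "load": 4.0}))
```

Returning plain numbers from each check keeps the results quantifiable, so the same readings can feed both a status page and a quota mechanism.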
- Projected Start Date:
- June, 2003
- Projected Completion Date:
- Monitoring and tracking: August, 2003
Reporting and control structures: Winter, 2003
- Major Steps:
-
Monitoring and Tracking
| Step: | Estimated Time: | Total Time Spent: | Status: |
| Project planning | 8 hours | | |
| Implementation planning | 9 hours | 3 hours | |
| Software design | 8 hours | | |
| Software coding | 20 hours | 2.25 hours | |
| Software testing | 3 hours | | |
| Documentation | 6 hours | 1.5 | |
- Notes:
-
Reporting and control structures
| Step: | Estimated Time: | Total Time Spent: | Status: |
| Policies and procedures | 6 hours | | |
| Software design | 16 hours | | |
| Software coding | 24 hours | | |
| Software testing | 3 hours | | |
| Documentation | 8 hours | | |
| Bootstrap control structures | 12 hours | | |
- Notes:
-

isrl-support@isrl.uiuc.edu