Tuesday, January 22, 2008

Basic OS Concepts

Basic Operating System Concepts:

All operating systems have some common functions; the older the OS, the fewer the functions. Figures 5.1 thru 5.5 show several views of 'the operating system', with schematics showing the perspective from the user interface; from RAM; a *ix shell visualization; a Windows GUI shell; and a generic 'command processor' loaded in RAM. The text aggregates the functions into five basic components: user interface, device management, file management, memory management, and processor management. Here's a system analyst's template that pretty much summed up the devices and media used in systems of the '60s: punched card & magnetic tape were used copiously; 'display', magnetic disk & drum, and 'terminal interrupt' were used less often. Those 'manual process' blocks got a heavy workout, usually meaning that some employee needed to stand in front of a card sorter & tend 'gang punches' and other heavy machinery used to read and punch records onto cards.

From the mid-'70s onward, a template this size can't hold symbols for all the I/O and other devices of which the modern OS must be 'aware': mouse, control keys, USB, CD, DVD, ZipDisk, game controllers, serial ports, ethernet, &c, &c have been interfaced with our popular OSs, making new technology easily accessible to application developers and users.
Standardization in how the devices' controllers work often allows newer, more efficient technology to be deployed without any changes in the OS's device drivers. Examples of this are seen in devices that attach to the PCI bus, IDE controllers, SCSI, USB, and other hardware interfaces in use today. A modern OS goes well beyond static textbook schematics in complexity, and when multiprogramming techniques like SMP are added into the mix the OS's operations become so complex that schematics can't show all that's going on. There's an old axiom about this: 'the computer doesn't have to draw lines'. Instead, computers use data structures like stacks, linked lists, 'process status words', and other techniques to keep track of it all. OS developers must be able to visualize, or otherwise make a 'mental map' of, the data structures and instructions hinted at in the last chapters when they write the memory & processor management schemes that tailor the OS to service each device supported by their platform.
Here are important concepts from the text:
The User Interface (1) is the most important OS component as far as we _users_ are concerned. It might be graphical, or might be character-based. Most systems today provide a mix. Figure 5.1 shows what the UI makes available for a system's users. That Device Management (2) box is right full these days. File, memory, and processor management functions (3, 4, & 5) round out the 'classic' OS's five components, and each is optimized for the particular platform at hand. In this figure, the text puts the UI at the top of the hierarchy, and shows the four other OS functions under the UI.
Another important OS interface does _not_ interface with the user. Application Program Interfaces (APIs) are alternate interfaces that software developers provide so that _programs_, as well as users, can use their software. For example, Microsoft's Excel provides APIs so that a script written in VisualBasic or VB.NET can easily open an Excel workbook and read/write on the worksheets, using any formulas or macros that may be in the worksheets' cells.
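The idea can be sketched in a few lines: one program exposes an API, and another _program_ (not a user) drives it. Everything below is a hypothetical toy for illustration; the class and method names are invented and are not Excel's actual API.

```python
# A toy 'workbook' API, loosely in the spirit of the Excel example above.
# All names here are hypothetical illustrations, not Microsoft's real API.

class Workbook:
    """A minimal workbook: sheets are dicts mapping cell names to values."""
    def __init__(self):
        self.sheets = {}

    def add_sheet(self, name):
        self.sheets[name] = {}

    def read_cell(self, sheet, cell):
        return self.sheets[sheet].get(cell)

    def write_cell(self, sheet, cell, value):
        self.sheets[sheet][cell] = value

# Another *program* drives the workbook through its API, no user involved:
wb = Workbook()
wb.add_sheet("Sales")
wb.write_cell("Sales", "A1", 1250)
total = wb.read_cell("Sales", "A1")
```

The calling script never touches the workbook's internal storage; it only uses the published methods, which is the whole point of an API.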
The Shell interfaces the computer with that most complex of 'devices', the User. A shell accepts users' commands in the format and syntax of its 'command language' (which might include mouseclicks) and can give some kind of error message if an improperly formatted or unknown command is issued, or if the user doesn't have sufficient privilege to use the command. These days, shells are either 'character based' or 'GUI' (Graphical User Interface), as in Windows, a Linux workstation running Gnome or KDE, or a Mac.
A character-based shell generally uses a 'command line' interface, where the command typed is usually the name of a program to load and run. A Linux 'console session' uses a command line interface, and may be entered on a PC's monitor, or remotely from another Linux machine or a Windows machine running 'terminal emulation software' like putty.exe or Windows' secure shell client. A prompt ($, %, # are common in unix) lets the user know that the system is waiting for their command. Users type in their command a character at a time, perhaps assisted by 'tab completion' or 'scrolling thru a command stack', and commands are 'submitted' to the shell when the user hits the Enter key. On Windows machines, the norm is to use the GUI, discussed next, but there are occasions in system or network management when a command line interface is more desirable than the GUI. Windows 95 & 98 continued to provide DOS 'underneath' Windows and the icon to get to a Windows command line is labelled 'DOS Command Line' (needs verification). On XP, there is no more DOS, per se, although DOS shell commands and scripts are generally supported. Now, the command line appears when you choose the 'Command Window'.
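At its core, a character-based shell is a loop: read a line, look the command up, run it or complain. Here's a minimal sketch; the 'built-in' commands and their outputs are invented for illustration:

```python
# A minimal sketch of a character-based shell's command handling.
# 'commands' stands in for built-ins; real shells also search the PATH.

def handle_command(line, commands):
    """Parse one command line; run it if known, else return an error message."""
    parts = line.strip().split()
    if not parts:
        return ""                       # user just hit Enter at the prompt
    name, args = parts[0], parts[1:]
    if name in commands:
        return commands[name](args)     # run the command with its arguments
    return f"{name}: command not found" # the shell's error message

# Hypothetical built-ins for illustration:
builtins = {
    "echo": lambda args: " ".join(args),
    "pwd":  lambda args: "/home/AStudent",
}
```

A real shell would wrap `handle_command` in a loop that prints the prompt, reads a line, and prints the result until the user logs out.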
A graphical shell (GUI) adds options where the user can double-click on an icon, or click on a link or menu choice, or use a 'keyboard shortcut' to call up a program in these 'object oriented' environments. A command line is available, but most users use the GUI. When a choice has been made, the OS looks at the properties of the icon or shortcut to discover the 'target' OS function or program to load and run for the user. (In Windows, right-click on an icon to see its properties. The target will likely be a full path to an executable or batch file that the OS can run.)
Commands might be 'built into' the OS, or they might be files kept in directories on the user's 'path'. For example: in Windows the 'copy' command is built into the OS and doesn't have a separate file for the command; the 'format' command's code is kept in c:\Windows\System32 along with other external commands. In Linux, practically all the commands are external and the most commonly used are in /bin, with those likely to be used by a super user kept in /sbin. When a command has been issued by a user, the OS first checks to see if the command is built into the OS, as many of the most primitive functions are. If it is, it's executed instantly. Otherwise, the OS looks on the user's 'path', in order, to find a match. The path is part of a user's 'environment', and the OS starts searching for external commands at the beginning of a user's path and executes the first match it encounters. In Windows, open a DOS or command prompt window and type 'path' to see the path that you have, or your systems administrator has, set up for you. In Linux, type 'set' to see all the settings for your environment, and find the PATH line. We'll take a side-trip in class looking at commands, and will make new commands by editing 'batch files' on Windows XP and Linux platforms.
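The lookup order described above — built-ins first, then the first match found while walking the path in order — can be sketched like this. The directory names and files on disk are made up for the example:

```python
# Sketch of how a shell resolves a command: built-ins first, then the
# first match found while walking the PATH in order.

def resolve(command, builtins, path_dirs, files_on_disk):
    """Return where a command would be found, or None."""
    if command in builtins:
        return "built-in"               # primitive functions live in the OS
    for d in path_dirs:                 # searched in order; first match wins
        candidate = d + "/" + command
        if candidate in files_on_disk:
            return candidate
    return None                         # shell would print 'command not found'

builtins = {"copy"}                               # built into the shell
path = ["/bin", "/usr/bin", "/usr/local/bin"]     # a made-up PATH
disk = {"/usr/bin/format", "/usr/local/bin/format"}  # two matches on disk
```

Note that even though 'format' exists in two directories, only the one earliest on the path is ever run, which is exactly why the order of a user's PATH matters.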
Device Management functions allow the computer to communicate with a platform's 'peripheral' devices like disk drives, network interfaces, serial ports, or devices on a SCSI bus or USB. Notice that memory & processor are not peripheral; they are at the center of the CPU's influence, and get their own OS functions. The best news about device management is that it is practically automatic on popular platforms, or they wouldn't be popular. Windows and desktop Linux distributions have processes (kudzu is one in Linux) that recognize that new devices have been added, or that old ones have been removed, and automatically adjust the OS configuration and load drivers to accommodate the changes. In Windows, the Device Manager is used to show and configure devices attached to the system. We'll look at this briefly in class. Linux keeps track of all its devices (and processes) in the /proc and /dev directories, and super users can use various programs, similar to Windows' Device Manager, to configure interfaces or load drivers for printers and other devices that need them. We'll look at some of these, too. Along with other descriptive stuff, the text addresses three concepts I'd like to discuss further: IOCS (I/O Control System), logical vs. physical I/O, and interrupts. Desktop PCs used to support Windows & Linux platforms have a BIOS (Basic I/O System) to handle the 'basic' devices used in the bootup process: keyboard, mouse, disk, & monitor. The BIOS is a relatively simple program, burned onto a ROM or EPROM, that starts running when the mainboard is first powered up, or when it is reset. The BIOS has enough intelligence to display bootup progress and watch the keyboard for a user's keystrokes, like the 'delete key' or 'F8', that have significance for the BIOS at hand. The BIOS knows how to find a 'bootable device' (which may be a CD, floppy, or hard disk) and start the software on that device, which is usually an OS like Windows or Linux that can handle more of the less basic I/O.
An IOCS is made up of OS code with primitive commands to communicate directly with controllers for all the peripheral devices likely to be attached to a platform. This way, application programmers can usually write their programs to issue 'logical requests' for I/O without having to know the primitives that handle 'physical I/O' on the devices their code accesses. Think back to the 'cylinder, surface, sector' organization of blocked data on a hard disk. The portion of the IOCS that handles disk access knows how to take a program's open & read statements (logical I/O), convert them to the primitive commands understood by the disk controller, and move blocks of data between disk and RAM (physical I/O), where they are available for the program being executed.
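The logical-to-physical translation can be sketched with a little arithmetic: a program asks for 'block 7' and the IOCS works out which cylinder, surface, and sector that is. The geometry numbers below are invented for illustration; real disks have far more sectors and surfaces.

```python
# Sketch of the logical-to-physical translation an IOCS performs for disk.
# Geometry is made up: 8 sectors per track, 4 surfaces per cylinder.

SECTORS_PER_TRACK = 8
SURFACES = 4   # i.e., tracks per cylinder

def block_to_chs(block):
    """Map a logical block number to (cylinder, surface, sector)."""
    sector = block % SECTORS_PER_TRACK
    track = block // SECTORS_PER_TRACK     # absolute track number
    surface = track % SURFACES             # which platter surface
    cylinder = track // SURFACES           # how far the arm must seek
    return cylinder, surface, sector
```

Blocks 0 thru 7 land on the first track; block 8 rolls over to the next surface of the same cylinder, and only after all four surfaces fill does the arm have to seek to the next cylinder.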
Interrupts are signals passed to a CPU to let it know that a device or a process needs its attention. Hardware and software can 'raise an interrupt' that is (as) immediately (as possible) processed by a CPU according to the protocol that has been established for that platform. For example, when network traffic arrives over a LAN the NIC buffers the incoming traffic and 'raises an interrupt' so that the CPU can process the data. Whenever you touch a key on a PC's keyboard an interrupt is generated so that the OS can handle the keystroke for you. 'Hardware interrupts' are some of the most limited 'real estate' on any platform, since each is represented by a trace on the bus. An Intel 8086 processor provided only 16 hardware interrupts, IRQ0 thru IRQ15. Here is a listing of hardware and software interrupts for the 8086. It was easy to have 'conflicts in IRQ settings' on PCs with lots of interface cards in them. Today's Pentiums have more, and looking at the control panel, system folder, device manager, and choosing view by type or connection will show how the 24 (?need to verify this) interrupts are assigned on your PC. Also, the PCI bus and BIOS can do some 'IRQ steering' automatically to keep network admins from pulling out their hair. There are many more 'software interrupts' available than hardware, as you may notice in the above listing. Interrupt processing is relatively 'expensive' for a CPU, since whatever the CPU is doing is literally interrupted when an interrupt is received. When multiple interrupts are generated at nearly the same instant they are queued. This is a great simplification of 'interrupt processing': the OS 1) saves the status of the 'current' program it is running, and may have to save any data in the CPU that is associated with it, 2) transfers control to an appropriate I/O routine for the device on that interrupt, 3) handles the interrupt request, and 4) after the interrupt is serviced the current program is reloaded and processing continues.
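Those four steps can be sketched as a tiny simulation. The IRQ numbers, device handlers, and the running program's name below are all invented for illustration:

```python
# The four interrupt-processing steps above, sketched as a simulation.
# IRQ numbers and handlers are hypothetical.

def service_interrupts(current_task, pending, handlers, log):
    """Save the current task, run a handler per queued interrupt, resume."""
    saved = current_task                      # 1) save the program's status
    while pending:                            #    queued interrupts, in order
        irq = pending.pop(0)
        handler = handlers[irq]               # 2) transfer to the I/O routine
        log.append(handler())                 # 3) handle the request
    return saved                              # 4) reload and continue

handlers = {1: lambda: "keyboard serviced",
            14: lambda: "disk serviced"}
log = []
# Disk and keyboard interrupt at nearly the same instant; both get queued:
resumed = service_interrupts("spreadsheet", [14, 1], handlers, log)
```

The 'expensive' part hinted at in step 1 is the save/restore of CPU state, which a real OS must do on every single interrupt before any useful work happens.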
How and where interrupts are processed varies depending on a platform's configuration. For example, a desktop PC running Windows passes an interrupt to the CPU for every keystroke a user makes. Keystrokes made in a system running on a host minicomputer with dedicated 'terminal I/O controllers' interrupt an I/O controller, which responds by storing the keystroke in a buffer and echoing it to the users' displays. Most of these 'intelligent I/O controllers' can handle more complex tasks, like the backspace or other edit keys. The processor on the I/O controller interrupts the CPU only when users hit their Enter keys. Then the contents of the buffer for that keyboard are transmitted all at once to the CPU when it acknowledges the interrupt. Disk, network, tape, and other controllers on larger machines are likely to have dedicated microprocessors that perform similarly to an intelligent keyboard controller. They can handle interrupts from the devices on their channel until everything has been queued up to send along to a CPU, which gets only one interrupt instead of the dozens or hundreds handled by the peripheral controller.
File Systems are associated with platforms. A particular platform may support several file systems that may sound familiar, and some platforms can accommodate 'foreign' file systems. Linux distributions currently use the ext3 file system, but there are other options available to a system administrator when a disk drive is being formatted. Windows' single-user OSs have used FAT (File Allocation Table), and more lately FAT32, as their file systems. Windows' servers use NTFS, which adds features needed to secure and backup files in a multi-user environment. A 'hybrid' Windows like XP Pro allows disk partitions to be formatted as either FAT32 or NTFS, and some users keep the FAT because it's sometimes faster since it doesn't have as much to do. Longhorn, when it's released, will come with an entirely new file system intended to make the searching and browsing we'd like to see easier and faster. File system management functions extend the 'logical side' of the IOCS, and allow the OS and application programs to reference data files by 'directory path' and 'name' and leave all the physical & logical I/O to the OS. Directories, or folders, are a common feature on most OSs. A 'disk directory' forms the 'root' of a 'tree structured' directory in most of them. The file system knows how to search this tree very quickly, finding a file by name and returning the drive, partition, cylinder, and sector where the file starts. A multi-user file system must be able to handle issues of 'file locking' or 'record locking' so that it is clearly evident when a file or record sought by one user is being updated by some other user. This isn't a problem for a single-user system like Windows 98 running on a desktop PC. Linux/Unix extends the 'file system' concept so that almost any device that can be attached to the system can be opened, read from, and written to without too much concern by the application programmer.
They also provide for the creation of 'named pipes' and 'sockets' via system bus, LAN, or the Internet so that programs, running on one machine or many, can easily move data among themselves without programmers having to be concerned with the physical I/O involved. Linux divides all devices into two classes: character devices handle 'streams' of data, like keyboards, USB, or a video camera; block devices transfer data in blocks, like disk, tape, or ramdisk (areas of RAM formatted like a file system). Block devices always transfer data in fixed-sized blocks. A tape drive block might be 16,384 bytes. A disk might be formatted for blocks of 512, 1024, 2048, or more bytes. Character devices transfer data one byte at a time and use some signal specified in the device's protocol to indicate where one data structure begins and another ends. Keyboards typically use ASCII character 13 (carriage return, generated by the Enter key) to signal the end of a line of text. Video devices, scanners, and other character devices use signals expected by their drivers. Since all Unix devices are defined in a directory in 'the file system' (/dev), programmers can reference most devices in their code using similar notation. A data file might be named like /home/AStudent/web/index.html. A tape drive might be named like /dev/st0 or /dev/st1, literally 'SCSI tape zero' and 'SCSI tape one'. A serial port might be named like /dev/ttys1 or /dev/ttyS1, depending on how the device attached to it is configured. For most purposes in Unix or Linux, the same 'open' and 'read' statements are used to get data from any of them into RAM where a program can use them.
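The block vs. character distinction can be sketched in a few lines: a block device hands back fixed-size chunks, while a character device yields bytes until a delimiter arrives. The block size, the fake disk contents, and the keystroke stream below are all invented for illustration:

```python
# Block vs. character devices, sketched. Sizes and data are made up.

def read_block(device_bytes, block_number, block_size=512):
    """Block device: transfer one fixed-size block, addressed by number."""
    start = block_number * block_size
    return device_bytes[start:start + block_size]

def read_line(stream_bytes, delimiter=b"\r"):
    """Character device: take bytes one at a time until the delimiter."""
    line = b""
    for b in stream_bytes:
        if bytes([b]) == delimiter:       # carriage return ends the line
            break
        line += bytes([b])
    return line

disk = bytes(range(256)) * 8              # 2048 bytes of fake disk
keystrokes = b"ls /dev\rls /home\r"       # a fake keyboard stream
```

Notice the block device is addressed by position and always moves a whole block, while the character device has no notion of position at all, only the order the bytes arrived in.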
The term Boot Process comes from the old adage: lift yourself up by your own bootstraps. 'Bootstrap loaders' are common on most platforms today. We just hit the power switch and the machine 'boots up.' In the good old days, a systems administrator for something like a DEC PDP-8 started one of these big beasts by hitting the power switch, but then nothing else happened. We had to toggle a 'starting address' into switches on the computer's chassis before hitting the 'load & run' button. The starting address might be for a tape drive if a new OS was being installed from a tape, or the address of the controller for the disk drive holding the OS for an ordinary day's run. (I stuck a red dot on the switches that needed to be pressed so I could talk somebody easily through the process if the machine had to be rebooted in my absence.) Today, boot processing is mostly automatic on most platforms, and on a Windows or Linux-based PC it is handled by the BIOS. The BIOS on a mainboard for an Intel processor lets the user set and save a 'bootup sequence' that will be followed whenever the machine is powered up or reset. It was common a few years back to keep it set to look for a 'bootable diskette' in A: first and boot with it if found; then C: (the hard drive) would be checked if the floppy drive was empty. Today, the sequence is more likely to go: CD/DVD, then floppy (if at all), and then C:. Either of these allows the user to 'boot to a floppy' or CD that will install software or hardware, or perhaps pillage the machine, without having to load the complete OS first.
Utilities are programs that are distributed with the OS to handle routine file system tasks. In Windows these are scattered among the accessories and system tools folders, and are used to backup, defrag, or format disks, or maybe recover 'lost' data. In Linux, there are several utilities for backing up data (cpio, tar, cc, & backup) depending on the requirements. Some utility programs may be purchased, perhaps because they do a better job than the one that's distributed with the OS, or because the OS doesn't provide that utility. In Windows, for example, many systems administrators use Veritas or other 3rd-party software to do backups of data, since the backup utility in Windows is slow and clumsy to use. In Linux, a systems administrator who needs to backup a lot of machines might purchase software, or find an open source solution like amanda to automate the process. Windows users are used to buying virus scanners, which can be thought of as utility programs, to make up for the lack of one in Windows.
Memory Management: The text does a good job of outlining and detailing these concepts. Read Chapter 6 and ask questions as needed. Important basic terms are: resident vs. transient routines; concurrency; partitions & regions; segmentation, address translation, displacement, paging, memory protection. Overlay structures are important for putting more code thru memory than there is memory to handle it. Virtual memory systems in multi-processing OSs (most of them today) use techniques like these along with 'paging' & 'swapping' that allow contents of 'real memory' to be 'paged in & out' of areas set aside on disk for the purpose. In a large host or server, paging is inevitable as the number of users increases. On a desktop system, it becomes a problem when 'too many windows are open' and we notice everything runs a _lot_ slower. An important practical matter about paging is that 'real memory' access is 1000s of times faster than 'virtual memory' access. Systems that have to do a lot of paging are slower than those that don't. Re-entrant coding techniques are used to minimize paging and help keep system response snappy. Big mainframes can control swapping between memory and disk by using RAMs that are gigabytes or terabytes in size, but smaller CPUs don't have this advantage quite yet. Maybe next year, as the 64-bit Itanium and Athlon processors advance, we'll see huge RAMs on our PCs. Where paging is excessive it's referred to as 'thrashing'. Couple this thrashing with the interrupts generated by several users hitting a host's keyboards, and the users will be posting those cartoons of a skeleton covered with spider webs and the caption "How's the system response time today?" This provides a good argument to upgrade the platform to one that can support enough memory & dedicated I/O processors to handle the 'user load.'
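Paging and thrashing can be sketched with a handful of lines: give the system a fixed number of real-memory frames, replace the oldest page when a new one is needed (FIFO replacement, one simple policy among several), and count the page faults. The reference strings below are invented; note how a workload that cycles through more pages than there are frames faults on _every_ reference:

```python
# Paging, sketched: fixed real-memory frames, FIFO replacement, and a
# count of page faults (each fault means a slow trip to disk).

def count_faults(references, frame_count):
    frames, faults = [], 0
    for page in references:
        if page not in frames:
            faults += 1                  # page must be brought in from disk
            if len(frames) == frame_count:
                frames.pop(0)            # evict the oldest page (FIFO)
            frames.append(page)
    return faults

calm = count_faults([1, 2, 1, 2, 1, 2], frame_count=3)          # fits in RAM
thrash = count_faults([1, 2, 3, 4, 1, 2, 3, 4], frame_count=3)  # doesn't
```

The first workload faults only twice (its working set fits in the three frames); the second faults on all eight references, which is thrashing in miniature: the system spends its time moving pages rather than doing work.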
Multi-programming is common today on most platforms. Since Windows 95, PCs have been able to run more than one program at a time. Before that, a PC user could have several programs 'up' at the same time, but only one at a time would actually be running. Now, we're used to having a few windows open and seeing that they're all running pretty much OK all the time. Multi-user systems have to do this multi-programming in spades. They may have to keep track of thousands of users' processes and give the illusion that each user has access to the whole system at the same time. The text does a good job of explaining how the Dispatcher, Control Blocks, and Interrupts work together to support multi-programming.
Time-Sharing platforms are geared to providing the shortest possible 'response time' to the largest number of users. They use techniques like roll-in/roll-out, time-slicing, or polling to divide CPU attention among the users. Minicomputer and mainframe platforms with their multi-channels and dedicated i/o processors excel at time-sharing applications and may provide thousands of users with subsecond response time as they work on the system.
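Time-slicing can be sketched as a round-robin dispatcher: each user's job gets one 'quantum' of CPU time in turn, then goes to the back of the line if it isn't finished. The job names and work amounts are invented for illustration:

```python
# Time-slicing, sketched: a round-robin dispatcher shares the CPU by
# giving each job one quantum in turn until all the work is done.

from collections import deque

def round_robin(jobs, quantum=2):
    """jobs: {name: units of work needed}. Returns the order of service."""
    queue = deque(jobs.items())
    schedule = []
    while queue:
        name, remaining = queue.popleft()
        schedule.append(name)                    # this job gets the CPU
        remaining -= quantum                     # ...for one time slice
        if remaining > 0:
            queue.append((name, remaining))      # unfinished: back of the line
    return schedule

order = round_robin({"alice": 4, "bob": 2, "carol": 3})
```

With a short enough quantum every user sees the CPU frequently, which is what creates the 'subsecond response time' illusion that everyone has the machine to themselves.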
Spooling is also common on most platforms today. The most visible application is the 'print spooler' provided on most OSs. When we print a Word document in Windows, it doesn't 'go directly to the printer', but is first written to disk & then copied to the printer. The Windows spooler is manifested in the Printers window, where a list of 'spooled jobs' will collect if the target printer is off-line because of a paper jam or other problem. Large, multi-user platforms may have to spool hundreds or thousands of print jobs among dozens or hundreds of printers. They need to have 'industrial strength' interfaces to the print spooler so they can solve problems by redirecting print jobs, or perhaps canceling one that goes wild. (I can relate that many performance problems and system failures have been caused by print jobs that 'run wild' and consume too much, or all, the available disk space!) Spoolers also hold other types of data 'in limbo' on disk until they're accessed by their owners. In Linux, for example, /var/spool/mail holds users' email until they use an email client to move the email into their own 'folders'.
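The spooler idea can be sketched as a queue with an escape hatch: jobs are accepted immediately (the program never waits on a slow printer), drained in order when the printer is ready, and a runaway job can be canceled before it prints. Class, method, and file names are invented for illustration:

```python
# A print spooler, sketched: jobs land on disk (a list here) and are
# drained to the printer later, so programs never wait on the printer.

class Spooler:
    def __init__(self):
        self.queue = []            # stands in for spool files on disk

    def submit(self, job):
        self.queue.append(job)     # returns immediately; printer may be off-line

    def cancel(self, job):
        if job in self.queue:
            self.queue.remove(job) # the 'industrial strength' escape hatch

    def drain(self):
        printed = []
        while self.queue:          # printer back on-line: print in order
            printed.append(self.queue.pop(0))
        return printed

spool = Spooler()
spool.submit("memo.doc")
spool.submit("runaway-report.doc")
spool.cancel("runaway-report.doc")   # cancel a job gone wild before it prints
```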
Deadlock is, hopefully, less and less common as platforms' performance increases and larger, faster machines become more affordable. When a system, or one of its components, is overwhelmed it might not be able to handle the user load. A disk that is too busy paging won't have time to retrieve programs or data needed by users -- this might be avoided by designating a separate disk for paging. Multi-user OSs are programmed to avoid common causes of deadlock, such as more than one process trying to access the same disk drive at the same time. Recently, I've heard the term used to describe what happens when too many users try to use an Access database (designed for single-user access) at the same time. The solution here is to get a _real_ DBMS like SQL Server and stop trying to abuse Access. Deadlock can sometimes be solved by 'throwing hardware' at the problem, providing adequate resources for the users.
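One classic cause of deadlock is two processes each holding one resource while waiting for the other's: a circular wait. A common avoidance trick, sketched below, is to acquire resources in a single agreed-upon order and back off rather than wait when something is busy. The resource and process names are made up for the example:

```python
# Deadlock avoidance, sketched: grant resources only in one global order,
# and fail fast instead of waiting in a cycle. Names are hypothetical.

def acquire_in_order(needed, held, owners, process):
    """Grant resources in sorted order; back off rather than wait."""
    for resource in sorted(needed):          # the agreed global order
        if resource in owners and owners[resource] != process:
            return False                     # busy: back off, don't deadlock
        owners[resource] = process
        held.setdefault(process, []).append(resource)
    return True

owners, held = {}, {}
ok_a = acquire_in_order({"disk", "printer"}, held, owners, "A")  # gets both
ok_b = acquire_in_order({"printer", "disk"}, held, owners, "B")  # backs off
```

Because both processes ask in the same sorted order, process B bumps into A's claim immediately and releases the CPU, instead of grabbing the printer and then waiting forever for the disk while A waits forever for the printer.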
Network Operating Systems (NOSs) include device drivers for network hardware, like NICs (Network Interface Cards), and they include software to send and receive data using various network protocols. NOSs also include functions to 'authenticate' users & other machines, allow them to access system resources for which they are 'authorized', and deny access to unauthorized users. Before discussing NOSs, here are a few paragraphs about their precursors, 'terminal emulators'. Many networks today use terminal emulators to get access to host computers and/or servers made available to users of the network. Before LANs and NOSs came on the scene, host computers, mainframes and minicomputers, had for decades been about the business of authenticating & authorizing users and allowing them to access application software that ran on the host to let them access data stored on the host's devices. As soon as PCs started sitting on desktops there was a need to connect them to some other computer. In the early '80s I recollect lots of PCs taking up space on desks along with a 'terminal', or 'dumb terminal', used to get into some host mainframe or minicomputer. It didn't take long before software came along to save some of the limited space on desks: 'terminal emulation software' and maybe an interface card would be added to a PC so that it could 'emulate', or 'behave like', the dumb terminal it would replace. For example, in the School of Business, IRMA software and an interface card with a co-axial connector were added to PCs and the IBM 3270 terminal could be taken away. This way the PC could do all the PC stuff, like word-processing & spreadsheeting, and could also be used as a terminal on a host computer. To attach to one of the many different 'minicomputers' of the time was even easier.
Most of these used a somewhat standard 'serial interface', the typical PC had two 'serial ports' on its backside, and all that was required to hook the PC up to a host minicomputer was terminal emulation software that would use the serial port for I/O with the minicomputer. Minicomputer manufacturers and third-party software houses provided terminal emulation software that allowed a PC to replace the terminals required. This allowed a single PC to attach to one, or a few, 'host computers'. The host at the center of this star topology might, or might not, have functions to let the PCs share files or other resources. Many of them did, and for most intents and purposes, offices rigged this way had 'a network' of PCs with a minicomputer at the center.