From PGS%MIT-OZ at MIT-MC.ARPA Fri Jul 27 17:50:00 1984
From: PGS%MIT-OZ at MIT-MC.ARPA (PGS%MIT-OZ at MIT-MC.ARPA)
Date: Jul 27 1984 11:50 EDT
Subject: front end 11
In-Reply-To: Msg of 27 Jul 1984 03:53-EDT from Christopher C. Stacy
Message-ID:

    Date: Friday, 27 July 1984 03:53-EDT
    From: Christopher C. Stacy

    MC stopped answering the dialups in the afternoon, and when I came in long afterwards the system was apparently "dead"; it was not answering anything. It appeared to be running though. Unfortunately the front-end 11 seems to have dropped dead; I could not get to DDT or even 11DDT. I used the disk bootload button to get the 11 back, and stopped and dumped ITS to CRASH;ITS 11HUNG in case anyone wants it. Then I cold booted.

Don't blame me. I wasn't installed.

From CSTACY at MIT-MC Fri Jul 27 00:00:00 1984
From: CSTACY at MIT-MC (Christopher C. Stacy)
Date: 27 July 1984, 00:00
Subject: front end 11
Message-ID:

MC stopped answering the dialups in the afternoon, and when I came in long afterwards the system was apparently "dead"; it was not answering anything. It appeared to be running though. Unfortunately the front-end 11 seems to have dropped dead; I could not get to DDT or even 11DDT. I used the disk bootload button to get the 11 back, and stopped and dumped ITS to CRASH;ITS 11HUNG in case anyone wants it. Then I cold booted.

From ALAN at MIT-MC Wed Jul 25 00:00:00 1984
From: ALAN at MIT-MC (Alan Bawden)
Date: 25 July 1984, 00:00
Subject: MC params
In-Reply-To: Msg of 25 Jul 1984 16:41-EDT from J. Noel Chiappa
Message-ID:

Noel's message reminds me... Last night I moved the various notes and things that used to be taped to the inside front door of the first MF10 into the first MH10. Now most of that information actually has little to do with the MH10s, but I figured it was useful to know what information we -used- to keep there in case anyone ever decides to post some modern truth in that location. (I marked them as out of date in a big red pen.)

From JNC at MIT-MC Wed Jul 25 00:00:00 1984
From: JNC at MIT-MC (J. Noel Chiappa)
Date: 25 July 1984, 00:00
Subject: MC params
Message-ID:

Since the whole Ampex seems to be online now, next time someone rebuilds the system they should change CONFIG > to allow MC to use all of the 2MW of memory on the system. Right now half the AMPEX is not being used.

From KLH at MIT-MC Fri Jul 20 00:00:00 1984
From: KLH at MIT-MC (Ken Harrenstien)
Date: 20 July 1984, 00:00
Subject: ITS seemed to be looping
Message-ID:

Well, I am too tired to pursue it further tonight, but the fragments appear to have come from host GYMBLE by way of BBN-MILNET-GW; it had several SMTP connections open at the time. This doesn't help find the bug (for that it takes a lot of grovelling through the datagram buffers) but may serve as a warning signal.

From KLH at MIT-MC Fri Jul 20 00:00:00 1984
From: KLH at MIT-MC (Ken Harrenstien)
Date: 20 July 1984, 00:00
Subject: ITS seemed to be looping
Message-ID:

Ignore my query about ITS version to :SL. I've been doing too much Unix ADB'ing and forgot ITS does the right thing as usual.

From KLH at MIT-MC Fri Jul 20 00:00:00 1984
From: KLH at MIT-MC (Ken Harrenstien)
Date: 20 July 1984, 00:00
Subject: ITS seemed to be looping
Message-ID:

Keep CRASH;WHAT HEY around for a while. If your PC trace is right, it was looping in the IP fragment re-assembly code, which has in the past been a notorious source of bugs (since it is both very hard to understand and very seldom exercised). Can you tell me what version of ITS it was running, so I can :SL the symbols from the right file on .; (and please continue to take more crash dumps any time this happens, since it is unlikely that the bug can be determined just from one example).

By the way, it makes sense that clock interrupts would be off, since this code runs at IMP interrupt level.

Oh boy, time to play with PEEK autopsy mode again!

From Moon at SCRC-STONY-BROOK.ARPA Tue Jul 17 05:06:00 1984
From: Moon at SCRC-STONY-BROOK.ARPA (David A. Moon)
Date: Jul 16 84 23:06 EDT
Subject: Ampex memory
Message-ID: <840716230650.0.MOON@EUPHRATES.SCRC.Symbolics>

Some (most? all?) of the ports on the Ampex memory don't work, I think because Ampex's transceiver cards for the DEC memory bus are flakey. That's probably why the cabling is weird. The interleaving options are all very confusing, especially since the cases that are not supposed to work actually do seem to work.

I talked about this with CStacy this afternoon some. I think the bottom line is that you can put the system into 4-way interleave mode if all of ports 4 through 7 (or wherever the processor is plugged into) on the Ampex work, but if any of those ports are broken and can't be used you have to use 2-way interleave mode so the Ampex will still see all memory requests. It's also unclear whether 4-way mode is better since it should be only a few percent faster in the DEC memory (if I remember a measurement I made in about 1976 correctly) and should be substantially slower in the Ampex memory, since it can only do one operation at a time (two at a time when both sectors are working).

MC:MOON;PERF > will measure various things but I don't know how you'll get the numbers for the MF10 configuration to compare against.

From JNC at MIT-MC Mon Jul 16 00:00:00 1984
From: JNC at MIT-MC (J. Noel Chiappa)
Date: 16 July 1984, 00:00
Subject: new memory
Message-ID:

The MH10's are installed. They are 8 port (dual controller) boxes, so KBUS0-3 are bussed through all four boxes on port 7-4 (and thence to the Ampex ports 7-4). The DL10 is port 0, the RP10 is port 1, and the TM10 is port 2. (Don't ask why it's different from the Ampex). The TM10 bus is terminated on the last DEC box, as before, but the cable to the AMPEX is right there if anyone wants to plug it in for some reason. The MH10's are currently in 4-way interleave in the hardware.
It looks like the processor is running two way interleaved; now that we have this wonderful working 4-way memory maybe we could switch back? Also, the AMPEX just got a write parity error on port 5. If this happens a lot someone should think about reseating the cables, etc. Finally, is there some sort of meter we can use to see if the machine is running faster?

From ALAN at MIT-MC Mon Jul 16 00:00:00 1984
From: ALAN at MIT-MC (Alan Bawden)
Date: 16 July 1984, 00:00
Subject: No subject
Message-ID:

CRASH;10JUL CRASH just happened again in exactly the same way. A comsat RESTRT job tried to rename STATS < to OSTATS >, and on the way out of the call it was noticed that there were still some locks locked.

From ALAN at MIT-ML Sat Jul 14 00:00:00 1984
From: ALAN at MIT-ML (ALAN at MIT-ML)
Date: 14 Jul 1984 00:00
Subject: No subject
Message-ID:

After remaining up for 13 days without even a memory error, ML finally crashed just now because of a software bug. At QSOC3E+15, on its way out of the system after trying to do an MLINK, an entry in QSNNR went negative. A dump of the same problem was taken by GSB a few weeks ago and can be found in MC:CRASH;QSOC3E +17 at ML if anyone wants to look for it...

From ALAN at MIT-MC Sat Jul 14 00:00:00 1984
From: ALAN at MIT-MC (Alan Bawden)
Date: 14 July 1984, 00:00
Subject: ITS seemed to be looping
In-Reply-To: Msg of Sat 14 Jul 84 02:43 EDT from David A. Moon
Message-ID:

    Date: Sat, 14 Jul 84 02:43 EDT
    From: David A. Moon

    Things to try:

    Raise switch 0 (the switch 0 on the left). If this goes to DDT, it's taking clock interrupts.

Yeah, I forgot to mention that I tried that, and nothing happened.

    Hit Break and type PC . If I remember correctly, you can read the PC this way without halting the machine. There are some other status-type commands; PCF is the PC flags, PI is the interrupt status.

OK, well it happened again which gave me a chance to try out the things I remembered from this message (wish I had made hardcopy as soon as I got it!).
Repeated applications of PC showed it looping in the general vicinity of IPRD61 (assuming it was looping in the system, which is reasonable to assume if it isn't taking clock ints!). This is a routine in INET which looks to me like it might well loop given the right gubbish in memory. Perhaps with that information someone can take a look at the crash dump I took (still in CRASH;WHAT HEY) and figure out what caused this.

From ALAN at MIT-MC Sat Jul 14 00:00:00 1984
From: ALAN at MIT-MC (Alan Bawden)
Date: 14 July 1984, 00:00
Subject: Its a mystery to me.
Message-ID:

How come when a job gets a parity error its superior DDT always gets a %PIB42? DDT seemingly gets it simultaneously with the interrupt from the inferior, but that's what you would expect of a real B42, right? There doesn't seem to be anything wrong with what DDT is doing as far as I can tell...

From Moon at SCRC-STONY-BROOK.ARPA Sat Jul 14 08:43:00 1984
From: Moon at SCRC-STONY-BROOK.ARPA (David A. Moon)
Date: Jul 14 84 02:43 EDT
Subject: ITS seemed to be looping
Message-ID:

Things to try:

Raise switch 0 (the switch 0 on the left). If this goes to DDT, it's taking clock interrupts.

Hit Break and type PC . If I remember correctly, you can read the PC this way without halting the machine. There are some other status-type commands; PCF is the PC flags, PI is the interrupt status.

Hit Break and type SP . This stops the machine cleanly (between instructions). If this works, the microcode isn't looping. Now you can get the PC then type DDT (or ST 774000) to get into DDT and decode that PC.

If the microcode is looping the SM command will restart it. This also does nasty things like resetting the I/O bus. I think it preserves the PC though.

There is a command file, J KLHUNG, which prints out everything in sight. 90% of what it prints is worthless, but it includes micro and macro PCs. I believe there is a piece of paper taped to the machine that tells you to do J KLHUNG.
Of course there are a lot of pieces of paper taped to the machine!

From ALAN at MIT-MC Sat Jul 14 00:00:00 1984
From: ALAN at MIT-MC (Alan Bawden)
Date: 14 July 1984, 00:00
Subject: No subject
Message-ID:

ITS died in a new way (for me) just now. The immediate symptoms were what I would expect if the microcode was looping. (No lights on any box were blinking, nothing was visibly doing anything, the system console had not printed anything to indicate anything was wrong.) I thrashed around a bit trying to find the KLDCP documentation to see if I could learn anything about what the processor was doing. When Taft found me a copy I was frustrated enough that I just took a crash dump (yeah, I know, that's probably worthless, it's in CRASH;WHAT HEY) and reloaded it.

So what should I have done?

From CSTACY at MIT-MC Wed Jul 11 00:00:00 1984
From: CSTACY at MIT-MC (Christopher C. Stacy)
Date: 11 July 1984, 00:00
Subject: ml doesn't get updates
In-Reply-To: Msg of 11 Jul 1984 23:16-EDT from Pandora B. Berman
Message-ID:

Aha, ML had the old version of INQUIR so that updates from MC went to ML, but ML didn't update itself! I installed the new one.

From CENT at MIT-MC Wed Jul 11 00:00:00 1984
From: CENT at MIT-MC (Pandora B. Berman)
Date: 11 July 1984, 00:00
Subject: ml doesn't get updates
Message-ID:

    Date: 9 July 1984 16:02-EDT
    From: Christopher C. Stacy
    Subject: ml doesn't get updates
    To: CENT @ MIT-MC
    cc: BUG-INQUIR @ MIT-MC, BUG-ITS @ MIT-MC
    In-reply-to: Msg of 9 Jul 1984 06:18-EDT from Pandora B. Berman

    ML should be getting updates; when it came back up I put it back in the list of machines to update and tested it. It worked for me; I'll look to see what is going wrong.

i looked at the relevant STATS files. running INQUIR on ML creates a piece of mail to INQUPD at MC; this reaches MC and is only dealt with locally. at neither juncture is an INQUPD performed on ML.

From ALAN at MIT-MC Wed Jul 11 00:00:00 1984
From: ALAN at MIT-MC (Alan Bawden)
Date: 11 July 1984, 00:00
Subject: No subject
In-Reply-To: Msg of 10 Jul 1984 08:50-EDT from Patrick G. Sobalvarro
Message-ID:

    Date: 10 July 1984 08:50-EDT
    From: Patrick G. Sobalvarro

    I found MC bug-halted. I dumped it to .;10JUL CRASH; dunno if anyone wants to poke at it.

Actually it's in CRASH;10JUL CRASH. While returning from a RENAME a job discovered it still had some locks locked. MC was parity erroring a bit at the time. Should we worry about this harder?

From JNC at MIT-MC Wed Jul 11 00:00:00 1984
From: JNC at MIT-MC (J. Noel Chiappa)
Date: 11 July 1984, 00:00
Subject: SECOND: hardwarily back
Message-ID:

Drive 1 is fixed, but I can't remember how to invoke UCOP to bring the directories over. Someone should bring the system down and bring it back up with SECOND: around.

From GLR at MIT-OZ Wed Jul 11 03:10:00 1984
From: GLR at MIT-OZ (Jerry Roylance)
Date: Jul 10 1984 21:10 EDT
Subject: Subnet 6
Message-ID:

Subnet 6 was broken at the MC bulkhead (return of the bad UHF connectors). The MC transceiver's power connector was also broken; everything is now in order.

From PGS at MIT-MC Tue Jul 10 00:00:00 1984
From: PGS at MIT-MC (Patrick G. Sobalvarro)
Date: 10 July 1984, 00:00
Subject: No subject
Message-ID:

I found MC bug-halted. I dumped it to .;10JUL CRASH; dunno if anyone wants to poke at it.

From ALAN at MIT-MC Tue Jul 10 00:00:00 1984
From: ALAN at MIT-MC (Alan Bawden)
Date: 10 July 1984, 00:00
Subject: No subject
In-Reply-To: Msg of 10 Jul 1984 00:19-EDT from David C. Plummer
Message-ID:

    Date: 10 July 1984 00:19-EDT
    From: David C. Plummer

    I'm still not a speed reader. (I'm coming in over a vadic, AAA, ITS 1370, DDT 1480.) U prints the bye message and then clears the screen.

OK, OK. I fixed it again. If it's still broken, perhaps you better see Evelyn Wood, because I can't imagine how I can have failed to fix it this time.

From DCP at MIT-MC Tue Jul 10 00:00:00 1984
From: DCP at MIT-MC (David C. Plummer)
Date: 10 July 1984, 00:00
Subject: No subject
Message-ID:

I'm still not a speed reader. (I'm coming in over a vadic, AAA, ITS 1370, DDT 1480.) U prints the bye message and then clears the screen.

From ALAN at MIT-MC Mon Jul 9 00:00:00 1984
From: ALAN at MIT-MC (Alan Bawden)
Date: 9 July 1984, 00:00
Subject: oops
In-Reply-To: Msg of 9 Jul 1984 10:14-EDT from David C. Plummer
Message-ID:

    Date: 9 July 1984 10:14-EDT
    From: David C. Plummer

    Either ITS was changed, or DDT was, and for the worse. When I log out, it does a :BYE followed by :OUTEST. I don't think either of these are at fault. What happens is that it prints the BYE message, and then clears the screen before printing the console free message. Needless to say, I'm not a speed reader.

Believe it or not, another symptom of this same bug was the fact that :CHUNAME didn't work anymore. I fixed the bug and installed a new DDT.

From CSTACY at MIT-MC Mon Jul 9 00:00:00 1984
From: CSTACY at MIT-MC (Christopher C. Stacy)
Date: 9 July 1984, 00:00
Subject: No subject
Message-ID:

I de-installed the latest DDT, since there are all these bug reports coming in and the responsible hacker is asleep now.

From CSTACY at MIT-MC Mon Jul 9 00:00:00 1984
From: CSTACY at MIT-MC (Christopher C. Stacy)
Date: 9 July 1984, 00:00
Subject: ml doesn't get updates
In-Reply-To: Msg of 9 Jul 1984 06:18-EDT from Pandora B. Berman
Message-ID:

ML should be getting updates; when it came back up I put it back in the list of machines to update and tested it. It worked for me; I'll look to see what is going wrong.

From JPG at MIT-MC Mon Jul 9 00:00:00 1984
From: JPG at MIT-MC (Jeffrey P. Golden)
Date: 9 July 1984, 00:00
Subject: CHUNAME
Message-ID:

When I am logged in as JEFFG and do
:CHUNAME JPG
It says OP? and won't chuname me.

From DCP at MIT-MC Mon Jul 9 00:00:00 1984
From: DCP at MIT-MC (David C. Plummer)
Date: 9 July 1984, 00:00
Subject: No subject
Message-ID:

Either ITS was changed, or DDT was, and for the worse. When I log out, it does a :BYE followed by :OUTEST. I don't think either of these are at fault. What happens is that it prints the BYE message, and then clears the screen before printing the console free message. Needless to say, I'm not a speed reader.

From CENT at MIT-MC Mon Jul 9 00:00:00 1984
From: CENT at MIT-MC (Pandora B. Berman)
Date: 9 July 1984, 00:00
Subject: ml doesn't get updates
Message-ID:

i modified my inquir entry a week ago to reflect my new office. ML still thinks i'm in 912. this is a bug. As long as ML is going to stay around for the indefinite (and probably not too short) future, would whoever diked it out of the inquir-update path please put it in again?

From CSTACY at MIT-MC Sun Jul 8 00:00:00 1984
From: CSTACY at MIT-MC (Christopher C. Stacy)
Date: 8 July 1984, 00:00
Subject: MLDEV
Message-ID:

MLDEV does not work between ITS machines except via Chaosnet. Reminder to me to make it check for alternate host addresses on the ARPAnet and try them if Chaos is unresponsive.

From CBF at MIT-MC Sat Jul 7 00:00:00 1984
From: CBF at MIT-MC (Charles Frankston)
Date: 7 July 1984, 00:00
Subject: The saga of Subnet 6
Message-ID:

MC-IO-11 and TOTO (OZ's network front end) don't communicate on subnet 6. Each one can apparently communicate with most of the other machines on subnet 6, but not with each other. DPH said that subnet 6 was extended a few days ago, which is apparently when this problem started.

My guess is that TOTO can send to MC-IO-11 since TOTO's routing table says to use subnet 1 to talk to MC rather than subnet 6, but for some reason MC-IO-11 probably tries to use subnet 6 to talk to TOTO and this loses.

Unplugging the MC-IO-11 subnet 6 interface allows MC and OZ to communicate. This is the state things have been left in.
However, this causes hosts which do not implement dynamic routing, such as ML and MIT-VAX, to be unable to talk to MC, since they think that subnet 6 is the way to get there.

From ALAN at MIT-MC Sat Jul 7 00:00:00 1984
From: ALAN at MIT-MC (Alan Bawden)
Date: 7 July 1984, 00:00
Subject: MC (not) broken
In-Reply-To: Msg of Fri 6 Jul 1984 23:34 EDT from PGS at MIT-OZ
Message-ID:

    Date: Thursday, 5 July 1984 18:12-EDT
    From: Christopher C. Stacy

    Something is wrong with MC's hardware. ITS is having trouble talking to the T-300s, and can barely reach the Chaosnet (even though the ncp statistics look reasonable). Users were getting wedged today unable to log out; one of my jobs was stuck in a CLOSE call with a FLSINS of zero. DPH reports that the IO-11 dropped very dead earlier.

Just for the record, DPH disconnected MC from subnet 6 and all the Chaos net problems went away. Other weirdnesses reported by CStacy may or may not have anything to do with this. DPH should put himself on Bug-ITS.

From PGS at MIT-OZ Sat Jul 7 05:34:00 1984
From: PGS at MIT-OZ (PGS at MIT-OZ)
Date: Jul 6 1984 23:34 EDT
Subject: MC broken
In-Reply-To: Msg of 5 Jul 1984 18:12-EDT from Christopher C. Stacy
Message-ID:

    Date: Thursday, 5 July 1984 18:12-EDT
    From: Christopher C. Stacy
    To: BUG-ITS at MIT-MC
    cc: BEDE at MIT-XX, MOON at SCRC-STONY-BROOK
    Re: MC broken

    Something is wrong with MC's hardware. ITS is having trouble talking to the T-300s, and can barely reach the Chaosnet (even though the ncp statistics look reasonable). Users were getting wedged today unable to log out; one of my jobs was stuck in a CLOSE call with a FLSINS of zero. DPH reports that the IO-11 dropped very dead earlier.

    Two mysteries:

    1. What are these blue wires running around MC? Someone told me they are an attempt to "make MC think it is talking to the ROLM switch by faking out the DL-10 or something?"

I suspect that this is just Ty trying to connect Rolm switch lines to normal tty lines (not under modem control).
If you find out that it's something else, would you let me know?

From ALAN at MIT-MC Thu Jul 5 00:00:00 1984
From: ALAN at MIT-MC (Alan Bawden)
Date: 5 July 1984, 00:00
Subject: :HOSTAT
In-Reply-To: Msg of 5 Jul 1984 14:21-EDT from Patrick G. Sobalvarro
Message-ID:

    Date: 5 July 1984 14:21-EDT
    From: Patrick G. Sobalvarro
    To: BUG-ITS

    :hostat ai-chaos-11 gives sysbin; hosts1 > - file not found.

I deleted the obsolete HOSTAT program so that people would stop being confused.

From CSTACY at MIT-MC Thu Jul 5 00:00:00 1984
From: CSTACY at MIT-MC (Christopher C. Stacy)
Date: 5 July 1984, 00:00
Subject: MC broken
Message-ID:

Something is wrong with MC's hardware. ITS is having trouble talking to the T-300s, and can barely reach the Chaosnet (even though the ncp statistics look reasonable). Users were getting wedged today unable to log out; one of my jobs was stuck in a CLOSE call with a FLSINS of zero. DPH reports that the IO-11 dropped very dead earlier.

Two mysteries:

1. What are these blue wires running around MC? Someone told me they are an attempt to "make MC think it is talking to the ROLM switch by faking out the DL-10 or something?"

2. The DEC log book says that the DL-10 had a bad module which was not replaced, but which was moved to another slot and that now everything "works fine".

If no one has any good ideas, I will call DEC and have them see about working on the DL-10 again. Meanwhile, the system is not very usable.

From PGS at MIT-MC Thu Jul 5 00:00:00 1984
From: PGS at MIT-MC (Patrick G. Sobalvarro)
Date: 5 July 1984, 00:00
Subject: No subject
Message-ID:

:hostat ai-chaos-11 gives sysbin; hosts1 > - file not found.

From GUMBY at MIT-MC Wed Jul 4 00:00:00 1984
From: GUMBY at MIT-MC (David Vinayak Wallace)
Date: 4 July 1984, 00:00
Subject: No subject
Message-ID:

MC appears to be unable to talk to OZ (finger, telnet, supdup), but finds it with :UP. Has no trouble with other chaos hosts I tried (prep, eddie, speech).

From MOON at MIT-ML Wed Jul 4 00:00:00 1984
From: MOON at MIT-ML (MOON at MIT-ML)
Date: 04 Jul 1984 00:00
Subject: HOSTAT OBERON
Message-ID:

:HOSTAT is an ancient program that retrieves a report on Arpanet hosts from DM. Obviously it should be flushed in favor of a program that uses Chaosnet HOSTAT protocol as PSZ evidently expected.

From GSB at MIT-MC Wed Jul 4 00:00:00 1984
From: GSB at MIT-MC (Glenn S. Burke)
Date: 4 July 1984, 00:00
Subject: [Forwarded: PSZ@MIT-MC, Re: ]
Message-ID:

    Date: 03 Jul 1984 00:00
    From: PSZ at MIT-MC

    :HOSTAT OBERON on MC says: sysbin;hosts1 > - file not found.