From ALAN at MIT-MC.ARPA Thu May 30 22:06:41 1985
From: ALAN at MIT-MC.ARPA (Alan Bawden)
Date: May 30 85 15:06:41 EST
Subject: All network ports in use
In-Reply-To: Msg of Thu 30 May 85 14:25:10 EST from David Vinayak Wallace
Message-ID: <[MIT-MC.ARPA].524704.850530.ALAN>

    Date: Thu, 30 May 85 14:25:10 EST
    From: David Vinayak Wallace

    Every once in a while I am unable to connect to MC via chaos; I open a connexion and MC says "All network ports in use." This is repeatable for about a minute or so, then I get a ddt. loadp shows 11 free net ports.

I fixed a bug that used to cause something like this to happen a while ago. A new ITS for MC has not been assembled since then.

From GUMBY at MIT-MC.ARPA Thu May 30 21:25:10 1985
From: GUMBY at MIT-MC.ARPA (David Vinayak Wallace)
Date: May 30 85 14:25:10 EST
Subject: All network ports in use
Message-ID: <[MIT-MC.ARPA].524623.850530.GUMBY>

Every once in a while I am unable to connect to MC via chaos; I open a connexion and MC says "All network ports in use." This is repeatable for about a minute or so, then I get a ddt. loadp shows 11 free net ports. What gives?

From CSTACY at MIT-MC.ARPA Thu May 30 16:16:15 1985
From: CSTACY at MIT-MC.ARPA (Christopher C. Stacy)
Date: May 30 85 09:16:15 EST
Subject: T-300s inaccessible
Message-ID: <[MIT-MC.ARPA].524200.850530.CSTACY>

MC's T-300 disk controller was powered off somehow, although the PDP11 and other frobs in the same rack were still on. I power cycled it and reloaded the system to get things going again.

From ALAN at MIT-MC.ARPA Wed May 29 22:52:32 1985
From: ALAN at MIT-MC.ARPA (Alan Bawden)
Date: May 29 85 15:52:32 EST
Subject: Most puzzling...
Message-ID: <[MIT-MC.ARPA].523215.850529.ALAN>

Typical MC crash today: MC BUGHLT's with an MUUO in exec mode (see apparently typical example in CRASH;CRASH MUUO'). After L XITS G The salvager does -not- print its usual greeting, nor is anything else printed on the system console.
However, if you raise switch 0, ITS will stop just as if it was running normally (see example in CRASH;CRASH NOSALV). Now doing L XITS G again will apparently work, except that the system will hang trying to get to unit 3 and you will need to try again after running BOOT11. (Note that something similar to this must have happened to Gumby the other day, except after a different initial crash...)

From TIM at MIT-MC.ARPA Wed May 29 09:52:25 1985
From: TIM at MIT-MC.ARPA (Tim McNerney)
Date: May 29 85 02:52:25 EST
Subject: Yow! I am reliable yet?
Message-ID: <[MIT-MC.ARPA].522693.850529.TIM>

The time is 02:46:34 EDT. Today is Wednesday, the 29th of May, 1985.
MC ITS 1488 has run for 103 days, 2 hours, 13 minutes, 32 seconds.
Surpassing all previous MC records for uptime!
System last revived 1 day, 2 hours, 12 minutes, 33 seconds ago.

From GUMBY at MIT-MC Sun May 26 10:01:18 1985
From: GUMBY at MIT-MC (David Vinayak Wallace)
Date: May 26 85 03:01:18 EST
Subject: latest crash
Message-ID: <[MIT-MC].518640.850526.GUMBY>

I came in and MC was down; I looked at the pc; couldn't figure out what happened so saved it in crash;crash gmmpp (it died at gmmpp+3). I loaded a new xits, but the salvager never ran. I halted it with switch 0 and reloaded from scratch.

From CSTACY at MIT-MC Sun May 26 04:28:34 1985
From: CSTACY at MIT-MC (Christopher C. Stacy)
Date: May 25 85 21:28:34 EST
Subject: AI COMSAT down for the count.
In-Reply-To: Msg of Sat 25 May 85 01:19:30 EST from Pandora B. Berman
Message-ID: <[MIT-MC].518526.850525.CSTACY>

This was because COMSAT assumes that the DEAD-MAIL-RECEIPTS list always exists, but I forgot to create it in NAMES >. I convinced the COMSAT on AI to gobble the latest NAMES before trying to unqueue anything, things are back to normal now, and I re-installed the mailer on AI.

Under certain conditions (such as fatal interrupts) AUTPSY comes out empty. COMSAT was probably dying with PDL overflows or something.
(BTW, I already fixed the other unrelated looping bug which ALAN reported last week.)

From CENT at MIT-MC Sat May 25 08:19:30 1985
From: CENT at MIT-MC (Pandora B. Berman)
Date: May 25 85 01:19:30 EST
Subject: AI COMSAT down for the count.
Message-ID: <[MIT-MC].517969.850525.CENT>

AI's COMSAT is stuck in a deadly loop. whenever it is launched, it first, of course, tries to cut down on the queued msgs by sending one of them. trouble is, the single one in its queue is to BUG-INQUIR at MC (alias CSTACY) and COMSAT thinks it's from DEAD-MAIL-RECEIPTS at AI. so after sending it (it does get through, each time; cstacy now has several of these in his mail) COMSAT tries to give a CMSG to DEAD-MAIL-RECEIPTS at AI. there is no such address, so COMSAT says ::=. then it tries to CMSG again to the same address. after several times, it dies after just saying CMSG.

BURNUP 1 was the first corpse dumped by this process. BURNUP 8 is the latest of several all of the same size (each died after only this work on the queue, while #1 included previous activity).

I tried writing a new NAMES > with the guilty address eqv'd to NUL:, as it is on MC; COMSAT did not try to compile this before working on the queued mail. BURNUP 9 is COMSAT dying after i renamed LIST EQV (the bin of NAMES >) to something else; again, COMSAT did not look for the names database before hacking the queued mail. this is a gross bug and should be fixed; if there is no names database, maybe it's for a good reason, like the previous one is fuckt and someone wants comsat to pick up the new (hopefully cured) version.

BURNUP 10 is from another experiment: i tried renaming LISTS MSGS to something else, COMSAT looked for NMSGS, found none, and died.

alan and moon tried to debug this over the phone, but couldn't get enough data (AUTPSY kept containing 0 !). i have renamed AI:COMSAT LAUNCH to COMSAT BRUNCH. ai now has enough pieces of mail that it won't accept any more from outside.
COMSAT should not be relaunched on AI until someone fixes this.

From CENT at MIT-MC Mon May 20 10:54:02 1985
From: CENT at MIT-MC (Pandora B. Berman)
Date: May 20 85 03:54:02 EST
Subject: forwarding... someone forgot to hang up?
Message-ID: <[MIT-MC].510876.850520.CENT>

    Date: May 17 85 22:29:22 EST
    From: Robyn D. Spencer
    To: BUG-ROLM at MIT-MC
    Message-ID: <[MIT-MC].509023.850517.TOOTSE>

    bug-its
    I (mason) just connected to mc through a rolm and found myself logged in as tootse, as a mail notification popped up. I just thought someone would like to know.
    mark

From MOON5 at MIT-MC Thu May 16 03:08:30 1985
From: MOON5 at MIT-MC (David A. Moon)
Date: May 15 85 20:08:30 EST
Subject: MC crashes when you read AI backup tapes
Message-ID: <[MIT-MC].505155.850515.MOON5>

I wasn't able to figure out the problem in detail, but I did edit a bug trap into the source at MGRD2. Sometime you should exhibit the problem with me physically present and I'll patch in this bug trap and see if it goes off.

From CENT at MIT-MC Wed May 15 07:52:40 1985
From: CENT at MIT-MC (Pandora B. Berman)
Date: May 15 85 00:52:40 EST
Subject: patch for crufty ai tapes makes mc die horribly?
Message-ID: <[MIT-MC].503776.850515.CENT>

i was trying to read some old AI tapes. they got errors, so with glenn's advice i installed your patch:

    MOON5 at MIT-ML 06/30/84 03:54:39
    To: GSB at MIT-ML, CENT at MIT-ML
    fyi the location to patch to disable tape read errors in ITS is MGMRT+1 which is changed from JRST MGERR to JRST MGRD2. It still tries a few times to retry errors but if it gets a fatal error it gives you whatever shitty data it got instead of barfing..
    ^_

then tried again with the tape. dump worked on tape a short while, with some hiccupping and one reported error, then system died. crash dump in CRASH IMRQ8. alan poked around at the code and thinks that possibly the patch is fucking up the MEMBLT table...

From GSB at MIT-MC Tue May 14 01:54:28 1985
From: GSB at MIT-MC (Glenn S. Burke)
Date: May 13 85 18:54:28 EST
Subject: bugpause: tty: buffer empty at tyirem
Message-ID: <[MIT-MC].501533.850513.GSB5>

CRASH;TYIREM NOBUFF

From KMP at MIT-MC Fri May 10 19:50:00 1985
From: KMP at MIT-MC (Kent M Pitman)
Date: May 10 85 12:50:00 EST
Subject: No subject
Message-ID: <[MIT-MC].495319.850510.KMP>

MC has a pile of jobs which have PARERR'd out. Several dead COMSATs, etc. Someone might want to look at this.

From GSB at MIT-MC Fri May 10 11:44:57 1985
From: GSB at MIT-MC (Glenn S. Burke)
Date: May 10 85 04:44:57 EST
Subject: wedgitude
Message-ID: <[MIT-MC].494815.850510.GSB>

CRASH;WEDGED AGAIN is an example of this evening's wedgitude, dumped after being stopped by switch 0.

From JNC at MIT-AI Fri May 10 07:39:42 1985
From: JNC at MIT-AI (J. Noel Chiappa)
Date: May 10 85 00:39:42 EST
Subject: Zapppp!
Message-ID: <[MIT-AI].590.850510.JNC>

There was some breeze-shooting on the subject of speeding up the DZ output interrupt service routines which I thought I'd note down. Since a 9600 baud line running flat out generates 1000 interrupts/second, and the output ISR looks quite long at the moment, a few lines running flat out will probably bring the machine to its knees.

The theory is that we should implement efficient block mode output; i.e. the driver emulates a DMA device and takes big chunks of data each time ITS talks to it. We should blow some of the extra register sets that we have, one per DZ. 8 of the registers would be used for things like holding a pointer to the registers, etc, and temps, and the other 8 would be pointers to output buffers. The pointers would probably be AOBJN style pointers, with the data unpacked one byte per word. (We can't use byte pointers because there isn't any room for a count if so and there aren't enough registers to have counts too, unless we restrict ourselves to 4 general registers and keep the counts in halfwords.)
At that point, the output ISR would be about 10 instructions long:

OPBASE = 6                      ;Base of AOBJN pointer block in regs

        IORDI A,%DZRCS(D)       ;Get CSR
        TRNN A,%DZCTR           ;Transmitter ready?
         JRST LOSE              ;No, spurious interrupt
        LDB A,[.BP %DZLM,A]     ;Get line number
        MOVE T,OPBASE(A)        ;Get AOBJN pointer for that line
        MOVE TT,(T)             ;Pick up data
        IOWRI TT,%DZRTD(D)      ;Write it out
        AOBJP T,DONE            ;All done yet?
        MOVEM T,OPBASE(A)       ;Put updated pointer back
        JRST 12,@TTYBRK         ;Dismiss

Or something like that, anyway.

From ALAN at MIT-MC Fri May 10 04:00:50 1985
From: ALAN at MIT-MC (Alan Bawden)
Date: Thu, 9 May 85 21:00:50 EST
Subject: unlocked
In-Reply-To: Msg of Thu 9 May 85 18:24:42-EDT from John Wroclawski
Message-ID: <[MIT-MC].494266.850509.ALAN>

    Date: Thu 9 May 85 18:24:42-EDT
    From: John Wroclawski

        From: Alan Bawden
        Subject: Everything is locked

        I guess it is unlikely, but it would be foolish of anyone else to try to assemble and try anything that uses KS-10 I/O instructions....

    Uh huh. Well, let me know when you think they work OK - I do hope to deal with the tape code sometime now that I'm back in town..

The I/O instructions work just fine. Last night we installed Jinx's DZ-11 code and it works too, so the path is clear for you.

From KLH at MIT-MC Wed May 8 00:25:44 1985
From: KLH at MIT-MC (Ken Harrenstien)
Date: Tue, 7 May 85 17:25:44 EST
Subject: No subject
Message-ID: <[MIT-MC].490451.850507.KLH>

    Date: Mon, 6 May 85 21:15:15 EST
    From: Alan Bawden
    In-reply-to: Msg of Mon 6 May 85 10:34:28 EST from Christopher C. Stacy

    If your COMSAT problem has to do with the hair COMSAT's go through when starting up to prevent there being more than one COMSAT, then I believe I remember KLH mentioning this before.

Yes, this has happened. Makes one wonder if locks really do work, or if it is just the narrowness of the race window that gives one the impression they work.
From ALAN at MIT-MC Tue May 7 12:58:57 1985
From: ALAN at MIT-MC (Alan Bawden)
Date: Tue, 7 May 85 05:58:57 EST
Subject: Everything is locked
Message-ID: <[MIT-MC].489464.850507.ALAN>

I guess it is unlikely, but it would be foolish of anyone else to try to assemble and try anything that uses KS-10 I/O instructions until I get a chance to come in and install and verify the new microcode. The macros in KSDEFS are gone and have been replaced with the new I/O instructions I microcoded up. What this means is that everything is probably broken.

From GUMBY at MIT-MC Tue May 7 09:34:47 1985
From: GUMBY at MIT-MC (David Vinayak Wallace)
Date: Tue, 7 May 85 02:34:47 EST
Subject: MC:.;IOEVEL AIBIN
In-Reply-To: Msg of Mon 6 May 85 21:26:40 EST from Alan Bawden
Message-ID: <[MIT-MC].489263.850507.GUMBY>

Doesn't the system console watch .; like sys***;?

From ALAN at MIT-MC Tue May 7 04:26:40 1985
From: ALAN at MIT-MC (Alan Bawden)
Date: Mon, 6 May 85 21:26:40 EST
Subject: MC:.;IOEVEL AIBIN
Message-ID: <[MIT-MC].488587.850506.ALAN>

Someone deleted the file MC:.;IOELEV AIBIN. It was either someone who reads this mailing list who thought he was cleaning up, or it was a random. In the latter case there is nothing to be done, but in the former case I can prevent this from happening again by reminding you all that AI-11 still runs, and still network boots occasionally.

From ALAN at MIT-MC Tue May 7 04:15:15 1985
From: ALAN at MIT-MC (Alan Bawden)
Date: Mon, 6 May 85 21:15:15 EST
Subject: No subject
In-Reply-To: Msg of Mon 6 May 85 10:34:28 EST from Christopher C. Stacy
Message-ID: <[MIT-MC].488567.850506.ALAN>

If your COMSAT problem has to do with the hair COMSAT's go through when starting up to prevent there being more than one COMSAT, then I believe I remember KLH mentioning this before. If this is a case of initializing a shared database using the algorithm given in .INFO.;ITS LOCKS, then perhaps the unlikely screw case documented there managed to actually happen.
From GSB at MIT-MC Tue May 7 01:14:42 1985
From: GSB at MIT-MC (Glenn S. Burke)
Date: Mon, 6 May 85 18:14:42 EST
Subject: No subject
Message-ID: <[MIT-MC].488010.850506.GSB>

    Date: Mon, 6 May 85 10:34:28 EST
    From: Christopher C. Stacy

    I just had a problem with COMSAT similar to the problem with PWORD the other day, where locks did not get unlocked. As in the previous case, the involved code has been stable for a long time (years). Is it possible that locks have become broken lately somehow? I heard there are bugs in the mechanism but I don't know what they are -- maybe I am just hitting some kind of screw case lately.

I remember tracking down and patching the pword lock at least once before, years ago (on AI). Maybe once on ML too. I don't believe we knew what caused the problem.

From CSTACY at MIT-MC Mon May 6 17:34:28 1985
From: CSTACY at MIT-MC (Christopher C. Stacy)
Date: Mon, 6 May 85 10:34:28 EST
Subject: No subject
Message-ID: <[MIT-MC].487059.850506.CSTACY>

I just had a problem with COMSAT similar to the problem with PWORD the other day, where locks did not get unlocked. As in the previous case, the involved code has been stable for a long time (years). Is it possible that locks have become broken lately somehow? I heard there are bugs in the mechanism but I don't know what they are -- maybe I am just hitting some kind of screw case lately.

From ALAN at MIT-MC Sun May 5 07:53:11 1985
From: ALAN at MIT-MC (Alan Bawden)
Date: Sun, 5 May 85 00:53:11 EST
Subject: No subject
In-Reply-To: Msg of Sat 4 May 85 19:24:06 EST from Glenn S. Burke
Message-ID: <[MIT-MC].485622.850505.ALAN>

Had there been paper the message would have read: "TTY: OUTPUT BUFFER POINTER PAST END OF BUFFER". I added this crash to the collection.

From GSB at MIT-MC Sun May 5 02:24:06 1985
From: GSB at MIT-MC (Glenn S. Burke)
Date: Sat, 4 May 85 19:24:06 EST
Subject: No subject
Message-ID: <[MIT-MC].485447.850504.GSB5>

mc crashed, was in ddt. no paper in the system console.
dumped to crash;look later if anyone is interested and can find anything interesting from it...

From JNC at MIT-MC Fri May 3 21:39:42 1985
From: JNC at MIT-MC (J. Noel Chiappa)
Date: Fri, 3 May 85 15:39:42 EDT
Subject: Drugs
Message-ID: <[MIT-MC].483905.850503.JNC>

No, I'm not on them. I know what it looks like when PEEK gets an MPV and that wasn't it. I'm not sure what it was (it wasn't that the job had got an unhandled MPV, which would have shown up in the N display) but it wasn't PEEK losing its ass.

From CSTACY at MIT-MC Fri May 3 21:33:57 1985
From: CSTACY at MIT-MC (Christopher C. Stacy)
Date: Fri, 3 May 85 15:33:57 EDT
Subject: No logins
In-Reply-To: Msg of Fri 3 May 85 12:48:09 EDT from J. Noel Chiappa
Message-ID: <[MIT-MC].483895.850503.CSTACY>

    Date: Fri, 3 May 85 12:48:09 EDT
    From: J. Noel Chiappa
    To: BUG-ITS
    cc: JNC
    Re: No logins

    PWORD was getting MPV's. I couldn't fix this (and backing up to an older PWORD didn't fix it) so I temporarily flushed PWORD and replaced it with DDT so that at least people could log in.

Although JNC doesn't think so, I am convinced that he was faked out by a bug in PEEK which sometimes causes it to print out "MPV not handled" when it screws up.

When I went and ran PWORD, I did not get any kind of MPV problems, but I did get hung after the command prompt. Somehow, the lock word in the password database was set and never cleared, so all jobs would .HANG forever as soon as they tried to access the database. The job which had claimed the database was long gone. I patched the password database back into working order and re-installed PWORD.

The code for PWORD has not been changed in a long time, so I suspect that some kind of hardware problem (or human) somehow screwed up the ITS locks critical routine feature, or munged the database.

From JNC at MIT-MC Fri May 3 18:48:09 1985
From: JNC at MIT-MC (J. Noel Chiappa)
Date: Fri, 3 May 85 12:48:09 EDT
Subject: No logins
Message-ID: <[MIT-MC].483484.850503.JNC>

PWORD was getting MPV's.
I couldn't fix this (and backing up to an older PWORD didn't fix it) so I temporarily flushed PWORD and replaced it with DDT so that at least people could log in.

From JGA at MIT-MC Fri May 3 18:32:32 1985
From: JGA at MIT-MC (John G. Aspinall)
Date: Fri, 3 May 85 12:32:32 EDT
Subject: No subject
Message-ID: <[MIT-MC].483460.850503.JGA>

MC has been up all morning, but refuses to establish proper connections for incoming network telnets and supdups. Dialups also appear flaky, but I'm not sure about this.

Symptoms - you get the telser message "MC Maximum Confusion PDP-10" and then nothing. Finger from another site shows lots of un-logged-in HACTRNs sitting there.

Note that outgoing chtn, telnet, supdup are fine. Incoming finger is fine too.

I don't know enough about what's going on to do anything constructive, but I'd be interested in finding out what the diagnosis is.