Server down

10 years 3 months ago - 10 years 3 months ago #1169 by SirParadox
Replied by SirParadox on topic Re: Server down
OK, when I got to work today I was informed (via AIM) that nobody could connect to play. I saw the Region process had exited. This has been happening lately: first to Primary (that went away), and now to Region. We are hurting Region; it should be running in native 64-bit mode and split up into many regions, so we are taxing its limits. However, that's no cause for the process to terminate with the debugger attached and give us no intel. I decided Friday to order 8GB more RAM, and it was on my desk this AM.

But I couldn't install it yet, busy day in the office, etc.

So we did a Safe (Save) down..

I knew it'd be a few hours, so in the background I did a full database backup and set up some live table copies, with the intent of finally testing my 'New Save' methods, then diffing afterwards to verify nothing changed between 'Old save' and 'New save'.
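
The diff step can be sketched roughly like this. It is only an illustration: it uses Python's sqlite3 with an in-memory database as a stand-in, and the table names (units_old, units_new), the columns, and the diff_tables helper are all made up here, not the game's actual schema or save code.

```python
import sqlite3

def diff_tables(conn, old_table, new_table):
    """Return (rows only in old, rows only in new) for two same-shaped tables."""
    # Table names are interpolated directly, so only pass trusted names.
    # EXCEPT compares whole rows and collapses exact duplicates.
    only_old = conn.execute(
        f"SELECT * FROM {old_table} EXCEPT SELECT * FROM {new_table}"
    ).fetchall()
    only_new = conn.execute(
        f"SELECT * FROM {new_table} EXCEPT SELECT * FROM {old_table}"
    ).fetchall()
    return sorted(only_old), sorted(only_new)

def demo():
    # Hypothetical snapshots: one written by the old save, one by the new save.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE units_old (unit_id INTEGER, owner TEXT)")
    conn.execute("CREATE TABLE units_new (unit_id INTEGER, owner TEXT)")
    conn.executemany("INSERT INTO units_old VALUES (?, ?)", [(1, "a"), (2, "b")])
    conn.executemany("INSERT INTO units_new VALUES (?, ?)", [(1, "a"), (2, "c")])
    return diff_tables(conn, "units_old", "units_new")
```

If both lists come back empty, the two save paths wrote the same rows; anything else is a row to chase down.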

I found some nice bugs! But I borked my code and had to end that beta test. :( Plus, with work getting real busy, it was distracting me too much.

So we also took that opportunity to make some procedure definition changes to Trade - Trade History, with the intent that when a trade is complete, it passes full info on to the history section. A player had some trades complete, but the units never changed ownership, and I was unable to backtrack to the UnitID. This will fix that.

I got the core of the procedure definitions changed, but for all the calls to the subs I just threw in -1, -1, -1, -1, with the intent that I can fill that shit out later. I only needed the definition to support the extra args, which is a 'down time' change.
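
As a rough sketch of that placeholder idea (in Python rather than the actual procedure language, and with argument names that are pure guesses since the post doesn't name the four extra args): the definition accepts the new arguments, existing call sites pass -1, and rows still carrying placeholders can be flagged for later.

```python
UNSET = -1  # the placeholder value passed at call sites that aren't filled in yet

def record_trade_history(trade_id, unit_id=UNSET, from_owner=UNSET,
                         to_owner=UNSET, quantity=UNSET):
    """Hypothetical history writer: returns the row it would store."""
    row = {
        "trade_id": trade_id,
        "unit_id": unit_id,
        "from_owner": from_owner,
        "to_owner": to_owner,
        "quantity": quantity,
    }
    # Mark rows that still carry placeholders so they are easy to find later.
    row["incomplete"] = any(
        value == UNSET for value in (unit_id, from_owner, to_owner, quantity)
    )
    return row
```

A call like record_trade_history(7) still works after the signature change, which is the point of doing the definition change during down time and the call-site changes later.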

OK. So I installed the new RAM, or at least my co-worker did for me. Looks like we mismatched it, because then I was getting a BSOD. I swapped the RAM into a different config (YYYYXXXX, where Y = new, X = old). It came up happy as can be, but Windows only sees 8GB of RAM; however, SIW (great app on the internets) shows 16GB physical, 8GB available to Windows.

I hunted everything down.
x64 [Check]
Running in 64bit os mode [Check]
Win 2008-R2 STD (License allows up to 32G) [Check]
SysConfig -> Advanced -> boot -> Memory limit (Was set to unlimited, set it to 16G) -> Rebooted -> Then it saw 6, undid that, rebooted.. blah. [Check].

SIW will not recognize the old ram beyond 'size'. The rest is unknowns.
Next two steps are: 1. Crucial.com's server detector detects my mainboard, and shows 4 slots empty!! Wth.

So I am having one of my NOC guys swap New and Old into (XXXXYYYY) setup. We'll see from there.

The next bigger step would be to flash the BIOS. I am fearful of this because I can still bring BP up right now with 8GB of RAM, like it was. If we brick it I won't be able to fix it tonight via a NOC tech; I'll need my own department's (IT) hardware doobies to take a look, and they are quite busy this week. We have the USB stick and image ready. Sadly you have to set a freken jumper to flash the BIOS, and my NOC techs, while hardware wizards, are afraid of bricking 'Ryans' server.. lol.

So maybe another half hour.. maybe another hour or two.. maybe tomorrow.. We shall see. Right now the box is unpingable.

== Update == 7:02pm
NOC guy reports that no matter how he puts the RAM in, the BIOS now reports 8GB but shows '8 chips installed, 4 disabled', and there is no setting to enable them. We are going back to the baseline of XXEEXXEE, where X = old RAM and E = empty slot. AAAABBBB are the banks, so we had 4 chips, 2 per bank. After this boot, if we see the original 8 gigs, OK, try the new RAM there. If we see 4 gigs then we say 'wtf, bank B died?'

== Update == 7:13pm
Ok now in the original layout, with old ram, and it locks entering the bios! HAH

== Update == 7:28 PM
Locked with the XXeeXXee setup. He did XXeeYYee; it shows in the BIOS 8GB installed, 4GB available, bank B disabled. So there must be a freken timing issue. The old RAM was supposed to be DDR2 667 (PC2-5300) FB ECC, so that's what I ordered for the upgrade. I am wondering, since SIW reports unknowns, whether the old RAM was in fact DDR2 533 PC2-5300 FB ECC, but I thought the PC2-5300 guaranteed us the speed.
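
For reference, the DDR2 labels tie together like this: the "DDR2-xxx" number is the data rate in MT/s, the I/O bus clock is half of that, and the "PC2-xxxx" number is roughly the peak MB/s (data rate times 8 bytes), rounded for the label. So DDR2-667 is PC2-5300 on a 333 MHz clock, while DDR2-533 would be PC2-4200. A small lookup sketch (the table holds the standard DDR2 speed grades; the helper names are just made up for this example):

```python
# (data rate in MT/s, I/O bus clock in MHz, module label) for standard DDR2 grades
DDR2_GRADES = [
    (400, 200, "PC2-3200"),
    (533, 266, "PC2-4200"),
    (667, 333, "PC2-5300"),
    (800, 400, "PC2-6400"),
    (1066, 533, "PC2-8500"),
]

def label_for_data_rate(mt_s):
    """Module label for a nominal DDR2 data rate, or None if not a standard grade."""
    for rate, _clock, label in DDR2_GRADES:
        if rate == mt_s:
            return label
    return None

def label_for_clock(clock_mhz):
    """Module label whose nominal bus clock is closest to the reported clock."""
    _rate, _clock, label = min(DDR2_GRADES, key=lambda g: abs(g[1] - clock_mhz))
    return label

def peak_mb_per_s(mt_s):
    """Peak transfer rate of a 64-bit (8-byte) module, before label rounding."""
    return mt_s * 8
```

So a tool that reports a 333 MHz memory clock is describing DDR2-667 / PC2-5300 parts, not DDR2-533.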

He can't find any settings to enable the bank. I told him to get the 8GB of new RAM working solo and go back to his job. I tried to call back to have him check the BIOS for 'Timing', 'Voltage' or 'Remapping', but he didn't answer.

So I think for tonight the server will come back up with only 8GB but at least this ram is known good ram.

[Specs]
New Ram
www.newegg.com/Product/Product.aspx?Item=N82E16820134633

MB-Intel S5000VSA
download.intel.com/support/motherboards/...000vsa_userguide.pdf,
Page 10 (pdf page 28) shows the memory specs and warnings
Page 20 (pdf page 38), figure 7, shows the bank layout

== Update == 7:40pm
Ok box is up, pingable, accessible. Only 8GB RAM :( But I'll bring the game up now :)
Last edit: 10 years 3 months ago by SirParadox.

10 years 3 months ago #1170 by Spaceweed
Replied by Spaceweed on topic Re: Server down
Try increasing the voltage on the second bank. It's a long shot, but I've seen it fix that problem before (not on a server board, mind, but it's worth a try; just don't overvolt).

I assume memory remapping is enabled in the bios too.

Wait, it now locks in the old config, with all the same memory in the same slots?

10 years 3 months ago - 10 years 3 months ago #1171 by Spaceweed
Replied by Spaceweed on topic Re: Server down
Is it me or can I not edit my posts? :P
Sorry for going AWOL. I'll be gone after this until tomorrow; if the problem persists I can throw more suggestions your way. I was always better with hardware than software, I'm a bit of a simple repairman :P

If the old RAM is PC2-5300 then it should run at 667MHz like the new ones. You checked the CAS latencies are the same, right? It may also be worth looking into the supported timings of the old and new RAM sticks; you should be able to see them in SIW anyway. The new RAM may also run at a lower or higher voltage than the original memory, putting the timings out of sync to the point of locking up the system. Again, something to look at in the BIOS. Though DDR2 is normally set to run at 1.8V, so that may not be it; or the motherboard isn't supplying enough volts to the second bank, which could happen.

Edit: I'm tired. To be honest, looking at it again, if you got as far as booting into Windows the first time, it's not a voltage or bad memory problem. Set it back to YYYYXXXX and, hopefully, if it still works, try the suggestions below.

If you find yourself in the old situation of 16GB physical, 8GB available, look into memory mapping or redundant memory mirroring in the BIOS, or the Physical Address Extension in Windows itself. I could give more info on these things, but a man of your caliber should be able to figure it out ;) and besides, I'm tired :P

Edit: Oh I wasn't logged in that's why I couldn't edit :S
Last edit: 10 years 3 months ago by Spaceweed.

10 years 3 months ago #1172 by SirParadox
Replied by SirParadox on topic Re: Server down
In SIW both old and new CAS timings said '5.0-5-5-15-20 @ 333 MHz'.

Server now has 4 new sticks, SIW shows all the details. Will physically inspect the old chips tomorrow if I have time.

!!!Server is up!!!

10 years 3 months ago - 10 years 3 months ago #1173 by SirParadox
Replied by SirParadox on topic Re: Server down
Here is the memory info. I don't like that it is reported wrong by the BIOS, so we might go ahead with the BIOS update tomorrow. Horus pointed out we have a redundant BIOS, so I can brick one :P

-- Forum won't let me post a png --
www.afterprotocol.com/images/bp01-memory.png
Last edit: 10 years 3 months ago by SirParadox.
