Important information
Author | Message |
---|---|
Joined: 12 Jun 24 Posts: 93 Credit: 154,060,246 RAC: 900,485 |
In short: the server's motherboard has two CPU sockets, but unfortunately one of them is damaged, so the machine runs at half power. In the next 2 weeks I would like to take advantage of the project's test period and take the motherboard to a service center, where I hope they will repair the socket for the second processor. What will this mean? An additional 8 cores / 16 CPU threads and the ability to add up to 128 GB of RAM, which will give the project server extra resources for the future. If the project happens to be unavailable, don't worry, it's just a service break. And when better to do it than during testing :) Greetings to everyone, and I ask for your understanding |
Joined: 21 Jun 24 Posts: 9 Credit: 10,016 RAC: 0 |
Good luck with the repair! I crunch for Ukraine |
Joined: 1 Jul 24 Posts: 8 Credit: 27,710 RAC: 75 |
Any idea how long the repair, or rather the entire service break, will take? In particular: longer than the deadline of our tasks? In that case it might be a good idea to stop sending out new ones before the shutdown. |
Joined: 12 Jun 24 Posts: 93 Credit: 154,060,246 RAC: 900,485 |
"Any idea how long the repair, or rather the entire service break, will take? In particular: longer than the deadline of our tasks? In that case it might be a good idea to stop sending out new ones before the shutdown." If the service can do the repair, it will be one day of work, so together with my own part I am planning a break of 2 days at most. |
Joined: 12 Jun 24 Posts: 93 Credit: 154,060,246 RAC: 900,485 |
OK, after I sent some photos of the CPU socket to the service, they told me they need 2 working days. I am going to turn the server off on the evening of 15th July and hand it in for service on 16th July. I hope to be back with the repaired server on the evening of 19th July. In the worst-case scenario the second CPU socket won't be repaired and the server will come back with its current parameters. Keep your fingers crossed :) |
Joined: 12 Jun 24 Posts: 93 Credit: 154,060,246 RAC: 900,485 |
Don't forget that today is the day the server goes down for 2-3 days for servicing. |
Joined: 22 Jun 24 Posts: 4 Credit: 15,300,366 RAC: 52,124 |
"Don't forget that today is the day the server goes down for 2-3 days for servicing." Thank you for the information and updates! Hopefully they will be able to actually fix the motherboard. CPU sockets can be pretty complicated, depending on the exact problem. |
Joined: 12 Jun 24 Posts: 93 Credit: 154,060,246 RAC: 900,485 |
In short, the things that could go wrong did go wrong. Yes, the server now has 2 processors, but the second socket (the reason the server went in for service) still has 3 damaged pins (if I checked correctly) that are responsible for RAM support. To sum up, the server now has 2 CPUs with 8 cores / 16 threads each, i.e. a total of 16 cores and 32 threads. Unfortunately, when it comes to RAM, it has 192 GB instead of 256 GB. I don't know yet whether I will consider replacing the 14 16GB sticks with 32 GB ones, but that's for the future. I tested the server under CPU and RAM load as far as I could; the rest will unfortunately come out during use. So far the project has started, I have generated 2 series of tasks, and now... I'm going to sleep off two sleepless nights. |
Joined: 21 Jun 24 Posts: 12 Credit: 217,878 RAC: 934 |
No hurry. Get some rest and figure out the rest of the configurations later on. |
Joined: 12 Jun 24 Posts: 93 Credit: 154,060,246 RAC: 900,485 |
To put it nicely, I am very angry at the service and what it did. The server seems to be unstable and may simply hang during normal operation (running CPS etc.). And all this despite quite rigorous, many-hour stress tests of CPU, RAM, disk I/O, etc., during which the server did not hang. Working on it. |
Joined: 12 Jun 24 Posts: 93 Credit: 154,060,246 RAC: 900,485 |
I have no idea if this is good or bad, but... I found some problems with one of the LSI disk controllers in the server last Sunday and Monday. I will replace it during the weekend; luckily I have two spare LSI controllers. At the moment I haven't seen any new problems for 2 days, and I am still monitoring it. |
Joined: 12 Jun 24 Posts: 93 Credit: 154,060,246 RAC: 900,485 |
The current server is probably dead :( It has been freezing all the time for the last 15 hours. I have bought new hardware, but I need to wait about 3-4 days for it. I will do my best to keep the old server alive as long as I can until then. I hope the new one will be less moody :( |
Joined: 21 Jun 24 Posts: 25 Credit: 308,854 RAC: 958 |
"The current server is probably dead :( It has been freezing all the time for the last 15 hours." All the best with that. Good work with the project so far, well done. Conan |
Joined: 12 Jun 24 Posts: 93 Credit: 154,060,246 RAC: 900,485 |
I spent the whole night observing the server, and everything is strange :) When my second CPU reaches 95 degrees the system halts, which is almost OK. But when I take this CPU out, my first CPU goes from its usual 60 degrees to 95 degrees, and it's not caused by server load: the load is about 6 in top or htop. When I move the CPU from socket 1 to socket 2 and put the second CPU in socket 1, the CPU in socket 2 goes to 95 degrees. You could say it is a cooling failure, and that would be almost OK, but... Right now I have 4 fans with an airflow of 110 m3 per hour each cooling the whole board, and yesterday evening I put 2 fans of about 40 m3 per hour directly on the two CPUs, so I have 6 fans moving about 500 m3 of air per hour through a closed case, and only the CPUs give me such high temperatures. The disks (I have 30: 24 HDDs and 6 SSDs) are cool at 32-35 degrees, all the heatsinks and components on the board are at about 40-45 degrees, the 3 LSI controllers are at 35-40 degrees, and the PSU is at 45 degrees... Only with the CPUs do I have this strange situation. What shocked me most: when the server froze this morning at about 5:30 CEST, CPU 1 was at 55 degrees, CPU 2 at 95 degrees, and the heatsink on CPU 2 at 105 degrees.. wow |
Joined: 27 Jun 24 Posts: 2 Credit: 755,356 RAC: 3,034 |
Sounds like the motherboard's voltage regulator module (VRM) is broken. It's giving too much voltage to the CPUs. Is there any way you can monitor the CPU voltages while the operating system is running? |
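For anyone wondering what such monitoring could look like: below is a minimal sketch, assuming the server runs Linux and its sensor chip has a kernel driver that exposes readings through the hwmon sysfs interface (`/sys/class/hwmon`). The function name `read_hwmon` is made up for illustration. Whether any voltage (`in*`) entries show up at all depends on the board and driver; on older hardware only core temperatures may be available.

```python
import re
from pathlib import Path


def read_hwmon(root="/sys/class/hwmon"):
    """Collect voltage and temperature readings from hwmon sysfs.

    Returns {"hwmonN:chipname": {"in0": (label, volts), "temp1": (label, deg_c)}}.
    hwmon reports in*_input in millivolts and temp*_input in millidegrees Celsius.
    """
    readings = {}
    for chip_dir in sorted(Path(root).glob("hwmon*")):
        name_file = chip_dir / "name"
        chip = name_file.read_text().strip() if name_file.exists() else "unknown"
        values = {}
        for sensor in sorted(chip_dir.glob("*_input")):
            kind = sensor.name[: -len("_input")]  # e.g. "in0" or "temp1"
            if not re.fullmatch(r"(in|temp)\d+", kind):
                continue  # skip fans, power, etc.
            try:
                raw = int(sensor.read_text().strip())
            except (OSError, ValueError):
                continue  # some sensors refuse to be read; ignore them
            label_file = chip_dir / f"{kind}_label"
            label = label_file.read_text().strip() if label_file.exists() else kind
            values[kind] = (label, raw / 1000.0)
        if values:
            readings[f"{chip_dir.name}:{chip}"] = values
    return readings


if __name__ == "__main__":
    for chip, sensors in read_hwmon().items():
        print(chip)
        for kind, (label, value) in sensors.items():
            unit = "V" if kind.startswith("in") else "C"
            print(f"  {kind:8s} {label:20s} {value:8.3f} {unit}")
```

Running it in a loop (for example once a minute, appending to a log file) would show whether a voltage or temperature drifts before a freeze. If no `in*` lines appear, the board simply does not expose voltages this way.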
Joined: 22 Jun 24 Posts: 10 Credit: 559,378 RAC: 1,771 |
I am always still awake at around 3 am my time, and I noticed on this laptop that this place had come back to life. I then had to go around to all my other PCs to tell them to send and receive more work from here, and I just finished that (just a few minutes before 10:00 AM UTC). |
Joined: 12 Jun 24 Posts: 93 Credit: 154,060,246 RAC: 900,485 |
I see no way to do that on this old server, only CPU core temperatures. |
Joined: 12 Jun 24 Posts: 93 Credit: 154,060,246 RAC: 900,485 |
"Sounds like the motherboard's voltage regulator module (VRM) is broken. It's giving too much voltage to the CPUs." I see no way to monitor that on this old server, only CPU core temperatures. |
Joined: 12 Jun 24 Posts: 93 Credit: 154,060,246 RAC: 900,485 |
I just got the package with the new server. I will try to get it up and running in the next 2-3 days. |
Joined: 12 Jun 24 Posts: 93 Credit: 154,060,246 RAC: 900,485 |
The bad luck continues :( In the new server two RAM slots are broken, so I have to return it to the seller and get another one. |
©2024 Goofyx Prodakszyn