This guide describes how to troubleshoot common faults on the Bitmain Antminer Z15 hash board and how to locate them accurately with the test fixture.
I. Requirements for Maintenance Platform
1. Soldering small SMD parts, such as chip resistors and capacitors, requires a constant-temperature soldering iron (370°C to 400°C) with a pointed tip.
2. Chips are removed and soldered with a hot air desoldering gun and a BGA rework station. Avoid prolonged heating, which can blister the PCB.
3. The hash board is tested and measured using an APW3, APW3+, or APW5 power supply (12V output, 140A max) and a self-made power adapter cable.
4. Tweezers, a Fluke multimeter, and a V9 test fixture; an oscilloscope if available.
5. Flux, board cleaner, and anhydrous alcohol. The board cleaner is used to remove flux residue and clean up the board's appearance after maintenance.
6. Tin-planting (reballing) tool, tin-planting steel stencil, and solder paste. When replacing a chip, clean the pads flat, tin the chip, and then solder it with the BGA rework station.
7. After repair, apply Antminer thermosetting adhesive to the chip and heat sink.
II. Job Requirements
1. Maintenance personnel must have more than one year of repair experience, relevant electronics knowledge, and proficiency in soldering BGA, QFN, and LGA packages.
2. After maintenance, the hash board must be tested at least twice and must pass every time.
3. When replacing a chip, follow the correct procedure. After any part has been replaced, the PCB must show no obvious deformation. Check the replaced part and the surrounding components for open and short circuits.
4. Identify the test software parameters and test tools that correspond to the board under repair.
5. Check that the test fixture and tools operate normally.
III. Principle and Structure
1. Principle Overview
1) The Z15 hash board carries three BM1746 chips, each powered through its own power management IC.
2) The BM1746 chip used by the Z15 has a working voltage of 0.78V; an LDO supplies VDDIO at 1.8V and VDDPLL at 0.7V.
3) Test points are provided on the chip side of the PCB. After a repaired or replaced chip has passed the test, apply thermal paste evenly to the IC surface and tighten the screws so that the heat sink makes good contact with the chip. The Z15 clock comes from a 25M crystal oscillator and is passed in series from the first chip to the third. During production maintenance, if no heat sink is fitted, use a fan to cool the PCBA during power-on testing and analysis. For inspection and maintenance, a low frequency of 200M is advised; once the board passes at this frequency, install the heat sink and test again.
2. Signal Direction of Z15 Hash Board:
CLK (XIN): generated by the 25M crystal oscillator Y1 and passed from chip 01 to chip 03; the voltage during operation is 0.85V.
TX (CI, CO): enters at pin 7 of the IO port (3.3V), passes through the level conversion IC U2, and is then passed from chip 01 to chip 03; the voltage is 0V when the IO cable is not inserted and 1.8V during operation.
RX (RI, RO): travels from chip 03 to chip 01, then through U1 to pin 3 of the signal cable terminal and back to the control board; the voltage is 0.3V when the IO cable is not inserted and 1.8V during operation.
BO (BI, BO): passed from chip 01 to chip 03; the multimeter reads 0V.
RST: enters at pin 3 of the IO port and is passed from chip 01 to chip 03; the voltage is 0V when the IO cable is not inserted or the board is in standby, and 1.8V during operation.
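The signal directions above suggest a simple way to narrow down a fault: walk the chain in the direction the signal travels and stop at the first chip whose test point reads abnormal. The following is a minimal Python sketch of that idea, not part of any Bitmain tool; the function name `first_bad_chip` and the pass/fail dictionary are hypothetical.

```python
# Chip order on the board. Per the guide, RI (RX) travels 03 -> 01 and
# every other signal travels 01 -> 03.
CHIPS = [1, 2, 3]
REVERSE_SIGNALS = {"RI"}


def first_bad_chip(signal, ok_at_chip):
    """Walk the chain in the signal's direction of travel and return the
    first chip whose test point reads abnormal, or None if all are normal.

    ok_at_chip maps chip number -> True (normal reading) / False (abnormal).
    """
    order = reversed(CHIPS) if signal in REVERSE_SIGNALS else CHIPS
    for chip in order:
        if not ok_at_chip[chip]:
            return chip
    return None


print(first_bad_chip("CLK", {1: True, 2: False, 3: False}))  # 2
print(first_bad_chip("RI", {1: False, 2: False, 3: True}))   # 2
```

The chip returned is where the signal is first lost, so the fault is most likely at that chip or between it and the previous chip in the direction of travel.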
3. Critical Circuits of the Z15 Hash Board
Measure the ten signal voltages around each chip (CLK, CO, RI, BO, and RST, each before and after the chip), the CORE voltage, the LDO 1.8V, the PLL 0.7V, and the 12V to 5V supply.
Detection method (each chip has a lead-out test point):
After you plug in the IO cable and press the test button on the test fixture, the PIC starts working. At this point the normal voltage at each test point should be:
CLK: 0.85V
CO: 1.6V to 1.8V
RI: 1.6V to 1.8V; an abnormal or low voltage here results in an abnormal hash board or a hash rate of zero.
BO: 0V, both when idle and when hashing.
RST: 1.8V. Each time the test button on the test fixture is pressed, the reset signal is output again.
When a voltage or test point state is abnormal, locate the fault by comparing the circuits before and after that test point.
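The expected test-point voltages can be written down as ranges and compared against multimeter readings. This is a minimal sketch in Python. The guide gives no tolerance for the single-value entries (CLK, BO, RST), so the ±0.05V `TOLERANCE` is an assumption, and the name `abnormal_signals` is hypothetical.

```python
# Assumed tolerance for entries the guide gives as a single value.
TOLERANCE = 0.05

# Expected voltages from the guide, as (low, high) ranges in volts.
EXPECTED = {
    "CLK": (0.85 - TOLERANCE, 0.85 + TOLERANCE),
    "CO": (1.6, 1.8),
    "RI": (1.6, 1.8),
    "BO": (0.0, TOLERANCE),
    "RST": (1.8 - TOLERANCE, 1.8 + TOLERANCE),
}


def abnormal_signals(measured):
    """Return the sorted names of signals whose reading is missing or
    falls outside the expected range."""
    bad = []
    for name, (low, high) in EXPECTED.items():
        value = measured.get(name)
        if value is None or not (low <= value <= high):
            bad.append(name)
    return sorted(bad)


readings = {"CLK": 0.85, "CO": 1.7, "RI": 0.4, "BO": 0.0, "RST": 1.8}
print(abnormal_signals(readings))  # ['RI']
```

A low RI reading like the one in this example is the case the guide links to an abnormal hash board or a hash rate of zero.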
IV. Common Faults and Their Symptoms
1. The PLL 0.7V or LDO 1.8V is abnormal. On the single-board test, not all chips are read and the data is marked “X”; on the whole miner, the symptom is missing chips, a dropped board, or chips marked “X”.
2. The chip working voltage is abnormal. On the single-board test, the number of chips read is abnormal; on the whole miner, chips are missing or the board drops. The voltage can be checked against the configured value, which is 0.78V by default.
Check the programming and soldering of the PIC and of the corresponding power management IC. Also check whether the MOS of the corresponding group is shorted (only one circuit is given as an example).
1) If a power management IC has been replaced, program it online as follows:
a. Power controller firmware (no download address is provided; the maintenance party must request the program from its contact department)
The programming file for the download cable (common to general maintenance) is:
BEZ36601_PWR_V0.2.hex
b. Download tool
The Intersil PMBus download cable and its interface pin order are as follows:
Connect the download cable to J10 on the PCB (position 2 in the figure below); GND, SDA, and SCL must be connected.
c. Burning software
Run the Production Configuration Tool, click position 1 and select the latest firmware BEZ24601_V03.hex, set position 2 to the address of the chip being programmed (0x50 for chip No. 01, 0x54 for chip No. 02, 0x58 for chip No. 03), choose the load configuration at position 3, and then press Run to begin programming.
The power management IC can be programmed only eight times, so program it with care.
2) Burn the PIC program of the hash board.
a. Program
PIC16LF1704-BM1840-APP.X.production-1903081005-V4.hex
b. Download tool
Connect the PICkit3 data cable to J3 on the PCB with pin 1 of the cable aligned to pin 1 of J3; only pins 1, 2, 3, and 4 need to be connected.
c. Burning software
Open MPLAB IPE, select the device PIC16F1704 (①), click Browse (②) and choose the .hex programming file, then click Connect (③). Once the connection is normal, click the Program button, then click Verify when programming finishes; a message that verification is complete confirms that the burn succeeded.
3. The 3.3V rail is abnormal or has no output. Check all 3.3V-related circuits, starting with the soldering, and make sure there is no short to ground.
4. The 5V rail is abnormal. Refer to item “1” above. If the 12V output is normal and the 5V rail is not shorted to ground, check the soldering; if the soldering is normal, replace the part. If there is a short circuit, examine the related circuits and components, including the parallel MOS.
5. A chip signal pin output is abnormal (BO/RST/CO/RI/CLK). Locate the abnormal position by following the signal direction.
Power off the board, then measure the chip's impedance to ground and compare it with a known-good board or a neighboring group. If available, use X-ray to inspect the chip's solder joints.
6. The temperature reading is abnormal. On the test fixture, the bad board is reported as temp NG (shown on the fixture interface and in the test log); on the whole miner, the temperature reads 0°C or cannot be read.
The temperature is sensed at the corresponding BM1744 chip, so for this fault check the soldering of the temperature sensor chip, its working voltage, and so on.
7. Too few nonces are returned. On the single-board test, a specific chip returns too few nonces; on the whole miner, chips are marked “X” or the error rate is high. First make sure the soldering around the bad chip shown in the log is in good condition, then re-solder or replace the corresponding NG chip.
Note: A chip with an abnormal return value is shown with an “X” on the test fixture interface.
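For the power management IC programming step under fault 2, the three addresses given (0x50, 0x54, 0x58) are spaced 0x04 apart, so the address for a chip can be computed from its number. The sketch below is illustrative only: the constant and function names are hypothetical, and the spacing is inferred from the three listed values rather than stated in the guide.

```python
BASE_ADDRESS = 0x50  # address of chip No. 01, from the guide
ADDRESS_STEP = 0x04  # spacing inferred from 0x50, 0x54, 0x58
CHIP_COUNT = 3       # the Z15 hash board has three chips


def pmic_address(chip_number):
    """Return the address to enter in the programming tool for the power
    management IC of the given chip (1 to 3)."""
    if not 1 <= chip_number <= CHIP_COUNT:
        raise ValueError(f"chip number must be 1..{CHIP_COUNT}, got {chip_number}")
    return BASE_ADDRESS + ADDRESS_STEP * (chip_number - 1)


for chip in range(1, CHIP_COUNT + 1):
    print(f"chip {chip:02d}: 0x{pmic_address(chip):02X}")
```

Rejecting chip numbers outside 1 to 3 matters here because the guide limits the number of programming cycles, so a wrong address costs more than a retry.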
V. Hash Board Maintenance Reference Steps
1. Routine inspection: First, visually inspect the hash board for scorching or PCB deformation, and for parts that are visibly burnt, knocked out of position, or missing; deal with any such damage first. If the visual inspection finds nothing, measure the impedance of each voltage domain to detect short or open circuits, and deal with any that are found before continuing. Finally, check that each group measures 0.78V; if a group does not, troubleshoot for programming problems or MOS faults.
2. Once the routine inspection passes (the short-circuit check in particular is required, so that a short does not burn the chip or other components at power-on), test the chips with the test fixture and determine the fault location from its results.
3. Based on the results displayed by the test fixture, start near the faulty chip and measure the chip test points (CO/NRST/RO/XIN/BI) and voltages such as VDD0V8 and VDD1V8.
4. Locate the fault point by following the power-up sequence and the signal flow: every signal travels in the forward direction (chip 1 to chip 3) except RX, which travels in reverse (chip 3 to chip 1).
5. Once the faulty chip is found, re-solder it: apply flux (preferably no-clean flux) all around the chip and heat the solder joints until they melt, so that the chip pins and pads re-wet and the solder re-flows. If the fault persists after re-soldering, replace the chip.
6. After repair, the hash board must pass the test fixture at least twice before it is judged good. First test: after replacing the parts, let the hash board cool, test it on the fixture, and once it passes set it aside to continue cooling. Second test: test again after the hash board has cooled completely. The short interval between the two tests does not slow the work down.
7. After the hash board has been repaired and has passed, keep the relevant maintenance and analysis records.
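The order of checks in steps 1 and 2 can be sketched as a short Python lookup that returns the first stage still failing, since each stage must be cleared before the next one is attempted. The stage names and the `next_action` function are hypothetical labels for illustration, not output from the test fixture.

```python
# Inspection stages in the order the guide gives them. Stage names are
# illustrative labels only.
STAGES = [
    ("visual", "Fix scorching, deformation, or burnt, shifted, or missing parts"),
    ("impedance", "Clear the short or open circuit in the affected voltage domain"),
    ("core_voltage", "Troubleshoot programming or MOS faults in the group not at 0.78V"),
    ("fixture_test", "Locate the faulty chip from the test fixture results"),
]


def next_action(results):
    """Return (stage, action) for the first stage that has not passed, or
    None when every stage has passed. A stage missing from results counts
    as not passed."""
    for stage, action in STAGES:
        if not results.get(stage, False):
            return stage, action
    return None


print(next_action({"visual": True, "impedance": False}))
```

Treating a missing result as a failure keeps the sketch conservative: the board is not powered on the fixture until the impedance check has been recorded as passed.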