AMC13 Debugging Hints
My AMC13 is plugged in but I can't contact it!
Check sensor info (NAT MCH)
If you have an NAT MCH, you can learn a lot about the state of your AMC13 (or other MicroTCA module) with the show_sensorinfo command. First, connect to your MCH using telnet:
[cms2] /home/hazen/amc13_python/src_amc13 > telnet 192.168.1.41
Trying 192.168.1.41...
Connected to 192.168.1.41.
Escape character is '^]'.
Welcome to NAT-MCH
nat> show_fru
FRU Information:
----------------
FRU Device State Name
==========================================
0 MCH M4 NMCH-CM
3 mcmc1 M4 NAT-MCH-MCMC
8 AMC4 M4 AMC13
13 AMC9 M4 AMC13
30 AMC13 M4 AMC13
40 CU1 M4 VT VT095
41 CU2 M4 VT VT095
50 PM1 M4 VT UTC010
60 Clk1 M4 MCH-DTC
==========================================
The show_fru command lists the "field replaceable units" (i.e. cards) plugged in to the crate. It knows about AMC13s, so they appear by name (in fact it currently mis-identifies the MiniCTR2 as an AMC13!). Here we have a MiniCTR2 in slot 4, an AMC13 in slot 9 and another AMC13 in the MCH2 site (FRU 30).
nat> show_sensorinfo 30
Sensor Information for AMC 13
========================================================
# SDRType Sensor Entity Inst Value State Name
--------------------------------------------------------
0 MDevLoc 0xc1 0x7a AMC13
0 Full 0xf2 0xc1 0x7a 0x01 Hotswap
1 Full Temp 0xc1 0x7a 33.6 ok Amb. Temp
3 Full Voltage 0xc1 0x7a 12.544 ok +12V
4 Full Voltage 0xc1 0x7a 3.2688 ok +3.3V BkEnd
5 Full Voltage 0xc1 0x7a 1.1932 ok T2 1.2V
9 Full 0xc0 0xc1 0x7a 0x83 0x00 GPIO 7:0
--------------------------------------------------------
nat>
The Hotswap sensor indicates the state of the switch connected to the handle (actually, the MMC state machine driven by it). When the module is correctly plugged in and the handle is pushed in, it should read 0x01 as above. The values gleaned from the MMC firmware listing are as follows (hex bit values, OR-ed together in the readout):
01 - handle closed
02 - handle open
04 - quiesced (??)
08 - backend power failure
10 - backend power shutdown
Also, the +12V sensor should read close to 12 V (12.544 in the example above).
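The Hotswap bit values listed above can be decoded with a few lines of Python. This is only a sketch: the table is copied from the list on this page, and the function name decode_hotswap is made up for illustration (it is not part of the AMC13 tools).

```python
# Bit meanings copied from the list on this page (hex values, OR-ed in the readout).
HOTSWAP_BITS = {
    0x01: "handle closed",
    0x02: "handle open",
    0x04: "quiesced (??)",
    0x08: "backend power failure",
    0x10: "backend power shutdown",
}


def decode_hotswap(value):
    """Return the names of all bits set in a Hotswap sensor readout."""
    names = [name for bit, name in sorted(HOTSWAP_BITS.items()) if value & bit]
    # Report any bits that are not in the table instead of silently dropping them.
    unknown = value & ~sum(HOTSWAP_BITS)
    if unknown:
        names.append("unknown bits 0x%02x" % unknown)
    return names


if __name__ == "__main__":
    # 0x01 is the value shown by show_sensorinfo in the example above.
    print(decode_hotswap(0x01))
```

A healthy, fully inserted module should decode to "handle closed" only.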
IP Address set incorrectly or unknown
What IP address is your AMC13 using? You can find out as follows.
$ cd ...../dev_tools/amc13Config
$ #--- edit systemVars.py to set your MCH IP address ---
$ ./scanCrate.pl
1: MMC: -none-
2: MMC: -none-
3: MMC: -none-
Opening 192.168.20.200...
AMC13Tool2 threw an exception
Address table path "/home/hazen/work/new/amc13/amc13/etc/amc13" set from AMC13_ADDRESS_TABLE_PATH
use_ch false
Created URI from IP address:
T2: ipbusudp-2.0://192.168.20.200:50001
T1: ipbusudp-2.0://192.168.20.201:50001
Caught microHAL exception.
4: MMC: 2.2 IP: 192.168.20.200 192.168.10.100 vv: 0x0000 sv: 0x0000 sn: 0
5: MMC: -none-
6: MMC: -none-
7: MMC: -none-
8: MMC: -none-
Opening 192.168.1.168...
9: MMC: 2.1 IP: 192.168.1.168 192.168.1.169 vv: 0x4037 sv: 0x0025 sn: 43
10: MMC: -none-
Opening 192.168.1.56...
11: MMC: 2.2 IP: 192.168.1.56 192.168.1.57 vv: 0x4037 sv: 0xfff9 sn: 227
Opening 192.168.1.42...
12: MMC: 2.2 IP: 192.168.1.42 192.168.1.43 vv: 0x100a sv: 0x0021 sn: 106
Opening 192.168.2.188...
13: MMC: 2.2 IP: 192.168.2.188 192.168.2.189 vv: 0x4037 sv: 0x002d sn: 161
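You can see that the board in slot 4 has crazy IP addresses (192.168.20.200 and 192.168.10.100): unlike the other boards, the two addresses are not consecutive, and AMC13Tool2 threw an exception when trying to open the board.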
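If you scan crates often, the summary lines can be checked automatically. The sketch below assumes the output format shown above (slot number, "MMC:", version, "IP:", two addresses, "vv:") and uses "the second address is not the first plus one" as the test for a misconfigured board. That rule is only a heuristic read off this listing, and suspicious_slots is a hypothetical helper, not part of scanCrate.pl.

```python
import ipaddress
import re

# Matches scanCrate.pl summary lines such as
# "9: MMC: 2.1 IP: 192.168.1.168 192.168.1.169 vv: 0x4037 sv: 0x0025 sn: 43"
LINE_RE = re.compile(r"^\s*(\d+):\s+MMC:\s+\S+\s+IP:\s+(\S+)\s+(\S+)\s+vv:")


def suspicious_slots(scan_output):
    """Return the slots whose two IP addresses are not consecutive."""
    bad = []
    for line in scan_output.splitlines():
        m = LINE_RE.match(line)
        if not m:
            continue  # "-none-" slots, "Opening ..." lines and exception text
        slot = int(m.group(1))
        first = ipaddress.IPv4Address(m.group(2))
        second = ipaddress.IPv4Address(m.group(3))
        if int(second) != int(first) + 1:
            bad.append(slot)
    return bad


# A few lines taken from the listing above.
SAMPLE = """\
3: MMC: -none-
Opening 192.168.20.200...
4: MMC: 2.2 IP: 192.168.20.200 192.168.10.100 vv: 0x0000 sv: 0x0000 sn: 0
9: MMC: 2.1 IP: 192.168.1.168 192.168.1.169 vv: 0x4037 sv: 0x0025 sn: 43
"""

if __name__ == "__main__":
    print(suspicious_slots(SAMPLE))
```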
You can reset them as follows:
./applyConfig.py --slot=4 -i 192.168.3.248
./storeConfig.py --slot=4 -i 192.168.3.248
This will reset the AMC13 to two successive IP addresses beginning with 192.168.3.248.
applyConfig sets the address immediately. storeConfig stores it in the EEPROM for the next power-up.
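To see which two addresses a given base address produces, and to avoid typing the slot and address twice, something like the following sketch can help. It only builds the two command lines shown above and does not run them. The helper names are hypothetical, and "second address = base + 1" is an assumption taken from "two successive IP addresses" and from the T2/T1 URIs in the scanCrate output.

```python
import ipaddress


def amc13_ip_pair(base):
    """Return the two successive addresses starting at `base` (assumed base, base+1)."""
    first = ipaddress.IPv4Address(base)
    return str(first), str(first + 1)


def config_commands(slot, base):
    """Build the applyConfig/storeConfig command lines from this page (not executed)."""
    args = ["--slot=%d" % slot, "-i", base]
    return [["./applyConfig.py"] + args, ["./storeConfig.py"] + args]


if __name__ == "__main__":
    print(amc13_ip_pair("192.168.3.248"))
    for cmd in config_commands(4, "192.168.3.248"):
        print(" ".join(cmd))
```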
If the above fails for some reason, you may need to erase the EEPROM attached to the MMC. Connect a mini USB cable (not micro USB) to the front panel, which is the console for the MMC microcontroller. It should enumerate as a USB emulated serial port, maybe /dev/ttyUSB0 (look in dmesg to see).
Then connect using a terminal program (minicom is what I use) set to 19200 baud, 8 data bits, no parity, no hardware or software handshake. You can type "help" to get a list of commands, but what you want is "eeperase" followed by "yes".
After this, cycle the crate or module power and the IP address should be reset to the S/N default (see IPaddressAssignment). Then you can use applyConfig and storeConfig to set it again.
Point 5 Test Crate
Some helpful hints from Jim.
From hcaldaq12:
telnet 192.168.1.41
h
(this gets you a list of commands)
show_fru
(this gets you list of modules)
pwr_off 30
(power off module no. 30)
pwr_on 30
(power on module no. 30)
Remote Hard reset of AMC13
It is rumored that the following will reset an AMC module if you have an NAT MCH:
I have a good news. After discussing this issue with several people and also with colleagues from HCAL, I discovered that NAT-MCH has a hidden set of commands, which you can simply run from telnet. So there is a working solution for the FPGA<->MMC communication getting stuck. The way I do so:
- "telnet 192.168.1.41" - connect to the MCH using telnet
- "show_pm" - print the list of the units, we will need the FruId from there. So if your FC7 is in slot AMC3, the FruId is 7
- "hidden" - print hidden commands:))
- "hard_reset 7" - reset the FC7
The description is for an FC7 but should apply to an AMC13 as well.
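The FRU id needed for hard_reset can be worked out from the AMC slot number. The examples on this page (AMC3 is FRU 7, AMC4 is FRU 8, AMC9 is FRU 13) all fit "FRU id = slot + 4". The sketch below assumes that rule holds for AMC slots 1 to 12. The function name is made up for illustration, and the AMC13 in the MCH2 site is a special case (FRU 30 in the show_fru listing above), so always confirm against the listing from your own MCH.

```python
def amc_slot_to_fru(slot):
    """Map an AMC slot number to the NAT-MCH FRU id (assumed rule: slot + 4)."""
    if not 1 <= slot <= 12:
        # The AMC13 in the MCH2 site shows up as FRU 30, not slot + 4.
        raise ValueError("rule only assumed for AMC slots 1-12, got %r" % slot)
    return slot + 4


if __name__ == "__main__":
    for slot in (3, 4, 9):
        print("AMC%d -> FRU %d" % (slot, amc_slot_to_fru(slot)))
```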
Talking to the MMC over IPMI
The IPMI command for graceful reboot is as follows:
ipmitool -H 192.168.1.240 -U '' -P '' -T 0x82 -B 0 -b 7 -t 0xa4 raw 0x2c 0x04 0 26 0x02
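For scripting, the command above can be assembled from named parts. This sketch only builds the argument list and does not run ipmitool. The parameter names reflect my reading of the ipmitool options (-H host, -U user, -P password, -T and -B transit address and channel, -t and -b target address and channel). The raw bytes are copied verbatim from the command above without interpretation, and the function name is hypothetical.

```python
import shlex


def graceful_reboot_cmd(host, target_addr="0xa4", target_chan="7",
                        transit_addr="0x82", transit_chan="0",
                        user="", password=""):
    """Build the ipmitool argument list for the command on this page."""
    # Raw request copied as-is from the page: netfn 0x2c, cmd 0x04, then data.
    raw_bytes = ["0x2c", "0x04", "0", "26", "0x02"]
    return ["ipmitool", "-H", host, "-U", user, "-P", password,
            "-T", transit_addr, "-B", transit_chan,
            "-b", target_chan, "-t", target_addr,
            "raw"] + raw_bytes


if __name__ == "__main__":
    # shlex.join quotes the empty user and password so the line can be pasted into a shell.
    print(shlex.join(graceful_reboot_cmd("192.168.1.240")))
```

To actually run it, the list can be passed to subprocess.run on a machine that has ipmitool installed and can reach the MCH.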
-- EricHazen - 20 Mar 2012