---++ HCAL Operations

   * HCAL Twiki: https://twiki.cern.ch/twiki/bin/viewauth/CMS/HCALWikiHome
   * HCAL Contact List: https://twiki.cern.ch/twiki/bin/view/CMS/HcalContactList
   * DOC on call: https://twiki.cern.ch/twiki/bin/view/CMS/HcalDOCHowTo
   * DAQ info: https://indico.cern.ch/event/519748/

---+++ On Call Cheat Sheet

---++++ Setting up Tunnels

Follow the [[https://twiki.cern.ch/twiki/bin/viewauth/CMS/HcalOperationsConnectivity][HCAL Connectivity]] twiki to get tunnels working for the lxplus, 904, and P5 networks. On Windows, the best choice is to run Linux in VirtualBox. If you are already on lxplus at CERN with a Windows computer, PuTTY works well too.

---++++ Setup Environment

[[BASH][Bash files]]

---++++ Snippets

Getting the HCAL config files at Point 5 (should be the same for 904):

$ cfgcvs checkout HcalCfg

Usage is very similar to svn:

$ cvs commit -m "Comment here"
$ cvs tag -F pro DTC.cfg

---++++ MCH Commands

<verbatim>
telnet hcal-mch-20
(hcal-mch-20 is named as in HcalCfg/uTCA/connection.xml, in the form hcal-card-crate-additionaloption)
>> show_fru
(M4 is on)
>> shutdown <fru_number>
(Run show_fru again to make sure the state of that particular card goes to M1)
>> fru_start <fru_number>
>> exit

To shut down all cards:
>> shutdown all

To reboot (this also restarts the MCH, so you will be disconnected. Wait ten seconds or so, then telnet back in to see the status):
>> reboot
</verbatim>

---++++ Run Control Websites

   * P5: http://cmsrc-hcal.cms:16000/rcms/gui/servlet/RunGroupChooserServlet
   * CMS TOP: http://cmsrc-top.cms:10000/rcms/gui/servlet/RunGroupChooserServlet
   * 904: http://cms904rc-hcal.cms904:16000/rcms/gui/servlet/RunGroupChooserServlet
   * H2: http://cmshcalTB02:16000/rcms/gui/servlet/RunningConfigurationServlet
   * Building 28: http://cmshcal21:16000/rcms/gui/servlet/RunningConfigurationServlet

Configuration Chooser sets up the run.
Once at the configuration screen, do the following in order:

   1 Set enable parameters
   1 Initialize
   1 Configure
   1 Start

Stop the run before killing the run.

---++++ uHTR

The uHTR tool is pretty straightforward. The only thing to watch out for is that the front and back FPGA firmware must not be mixed up (and they can be). The front and back files should also match in both the detector part (HF, HBHE) and the number after it, which is the link speed (1600 = 1.6 Gbps, 4800 = 4.8 Gbps). A matching pair:

uhtr_front_HBHE1600_1_00_07.mcs.xz
uhtr_back_HBHE1600_1_00_00.mcs.xz

Correct use for P5/904:

$ uHTRtool.exe -c crate:slot
(for example: uHTRtool.exe -c 52:10)

It is not yet clear what the shell script (uHTRtool.sh) does.

---++++ AMC13

Connecting to an AMC13: there is a shell script at P5 and 904 that passes the connection file and then the -i option:

AMC13Tool2.exe -c ~hcalsw/uTCA.connections.pro.xml -i hcal.crate$1.amc13 ${@:2}

This is the P5 script. The 904 script is almost identical, but the connection file is located at hcalsw/HcalCfg/uTCA/connections.xml. Just use this to connect:

$ ~hcalsw/bin/AMC13Tool2.sh crate#

6/24/16: To add the AMC13Tool2.sh script to hcalsw, commands were run as the hcalsw user:

$ sudo -u hcalsw then_a_command

Example to edit the shell script:

$ sudo -u hcalsw emacs AMC13Tool2.sh

---++++ Log Files

   1 Go to the 904 or P5 network.
   1 ssh cms904rc-hcal (for 904) or ssh cmsrc-hcal (for P5).
   1 Run the Handsaw script (located in ~hcalsw/bin/):

$ Handsaw.pl /var/log/rcms/hcalpro/Logs_hcalpro.xml
$ tail -f /var/log/rcms/hcalpro/Logs_hcalpro.xml | Handsaw.pl

The tail version streams the errors as they arrive and is probably the best one to look at.

Parsing through old logs: copy the log file you want to look at. They are usually compressed (.gz).
To open them:

$ gunzip file.gz

Once you have the log file, you can use Handsaw and less to look through it:

$ Handsaw.pl logfile.xml | less -R

---++++ Elog + Shiftlist

   * Main Page: https://cmsonline.cern.ch/webcenter/portal/cmsonline/pages_common?wc.contentSource=
   * Direct to Elog: https://cmsonline.cern.ch/webcenter/portal/cmsonline/pages_common/elog
   * Elog: Common -> Elog -> Subsystems -> Hcal -> Hcal, Hcal904, etc.
   * Shiftlist: Common -> Shiftlist

---++++ System Manager

This long-lived application writes IP addresses to the uTCA cards based on their crate and slot. It should detect movement or power cycles and write the new IP address. This does not yet work properly for the AMC13; instead, the application needs to be restarted if an AMC13 is exchanged or moved.

904:

P5: (Use on hcalutca01)

<verbatim>
$ sudo systemctl restart sysmgr
</verbatim>

Outdated:

<verbatim>
$ ~hcalsw/bin/restart_sysmgr.sh
$ sudo -u hcalpro ~hcalpro/scripts/Service_fix.sh
$ sudo -u hcalpro sysmgr ~hcalsw/config_files/Sysmgr/sysmgrHCAL.conf
</verbatim>

---++++ Random

---+++++ Killing Stale Xdaq Processes

When a run is destroyed, some of its processes may not be destroyed with it. These leftover processes interfere with the next run and cause errors on initialize.

Processes NOT to kill:

root 35246 18.7 7.3 6499672 2399100 ? Ssl Apr20 11439:33 /opt/xdaq/bin/xdaq.exe -h srv-s2f17-19-01.cms -p 9950 -u file.append:/var/log/hcal.xaad.log -e /opt/xdaq/share/hcal/profile/xaad.profile -z hcal
root 35247 8.2 0.2 4319652 91404 ? Ssl Apr20 5045:51 /opt/xdaq/bin/xdaq.exe -h srv-s2f17-19-01.cms -p 9999 -u file.append:/var/log/hcal.jobcontrol.log -e /opt/xdaq/share/hcal/profile/jobcontrol.profile -z hcal

Example of a stale xdaq process to kill:

hcalpro 41955 9.1 0.2 4261576 93072 ? Sl 22:37 1:01 /opt/xdaq/bin/xdaq.exe -h hcalutca01.cms -p 15002 -s 294342 -u xml://cmsrc-hcal.cms:16010 -l INFO

Stale XDAQ example:

hcalpro 13038 4.1 0.3 2588468 122480 ? Sl Sep15 71:14 /opt/xdaq/bin/xdaq.exe -h hcalutca01.cms -p 16789 -u xml://cmsrc-hcal.cms:16010 -l INFO

Do not kill anything that runs under root, such as job control, controlhub, xaad, etc. Be careful with pkill and similar tools that match on the name xdaq, since they will pick up those processes as well.

~hcalsw/bin/dump_all.sh

See if the relevant card is pingable. If a uHTR or AMC13 is not pingable:

   * Restart sysmgr using the script on the hcalutca01 machines: ~hcalsw/bin/restart_sysmgr.sh

Basically, ~hcalsw/bin/ contains all the magical scripts to do most things.

Crate Locations and FEDs: http://cmsdoc.cern.ch/cms/HCAL/document/CountingHouse/Crates/Crate_interfaces_transition.htm

Dump: ~hcalsw/bin/dump_all.sh

ps aux (just to check what is there; if there is no stale xdaq, it should be fine to just run: sudo service xdaqd restart)

sudo service xdaqd stop
pgrep xdaq (just to check what is there)
pkill xdaq
sudo service xdaqd start

THESE CHANGE QUITE OFTEN:

---+++++ Restart Services 904

I have begun working on a service fix script at 904. To run it, type:

sudo -u hcalpro ~hcalpro/scripts/Service_fix.sh

The currently functioning options are:

   * -tomcat
   * -sysmgr

The possibly functioning option is:

   * -ccmserver

Please use this and email/elog any issues or desired functionality.

---+++++ Restart System Manager p5

Here is the start system manager command. It should be used sparingly. In most cases (such as swapping uHTRs), the system manager will not need to be restarted.

sudo -u hcalpro sysmgr ~hcalsw/config_files/Sysmgr/sysmgrHCAL.conf

Dan Arcaro 3:39 PM
where is the alarmer xml?
martin kwok 3:40 PM
on hcalmon /opt/xdaq/share/hcal-common/alarm/

kakwok@xaas-hcal ~ > sudo systemctl list-dependencies hcal.target
hcal.target
● ├─hcal.addon.target
● │ └─hcal.addon@kvm-s3562-1-ip151-94.target
● │   ├─hcal.spotlightocci@kvm-s3562-1-ip151-94.service
● │   └─hcal.tstore@kvm-s3562-1-ip151-94.service
● └─hcal.service.target
●   └─hcal.service@kvm-s3562-1-ip151-94.target
●     ├─hcal.b2in-eventing@kvm-s3562-1-ip151-94.service
●     ├─hcal.bridge2g-sentinel@kvm-s3562-1-ip151-94.service
●     ├─hcal.bridge2g-xmas@kvm-s3562-1-ip151-94.service
●     ├─hcal.directory-service@kvm-s3562-1-ip151-94.service
●     ├─hcal.heartbeat@kvm-s3562-1-ip151-94.service
●     ├─hcal.heartbeatds@kvm-s3562-1-ip151-94.service
●     ├─hcal.sensords@kvm-s3562-1-ip151-94.service
●     ├─hcal.sentinelds@kvm-s3562-1-ip151-94.service
●     ├─hcal.timeline@kvm-s3562-1-ip151-94.service
●     ├─hcal.tracerd@kvm-s3562-1-ip151-94.service
●     ├─hcal.xmas-admin@kvm-s3562-1-ip151-94.service
●     └─hcal.xmas-slash2g@kvm-s3562-1-ip151-94.service

3:00 voila

Dan Arcaro 3:02 PM
uhh
3:03 still not sure what to restart

martin kwok 3:04 PM
last one: it’s slash2g on it: hcal.xmas-slash2g@kvm-s3562-1-ip151-94.service

-- Main.DanielArcaro - 21 Jun 2017
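The stale xdaq check described on this page (leave root-owned xdaq.exe processes alone, look at the ones owned by other users) can be sketched as a small filter. This is only an illustration: =list_stale_xdaq= is a hypothetical name, not one of the scripts in ~hcalsw/bin/, and the rule "not owned by root and command contains xdaq.exe" is an assumption drawn from the examples on this page.

```shell
#!/bin/sh
# list_stale_xdaq: hypothetical helper, NOT an existing ~hcalsw/bin/ script.
# Reads `ps aux`-style lines on stdin and prints the PID (column 2) of every
# xdaq.exe process whose owner (column 1) is not root.
# Assumption: root-owned processes (xaad, jobcontrol, ...) must never be killed.
list_stale_xdaq() {
    awk '$1 != "root" && /xdaq\.exe/ { print $2 }'
}

# Usage: review the candidate PIDs by hand before killing anything.
#   ps aux | list_stale_xdaq
```

The output is a list of candidates only. A process that belongs to a run that is still active would also be listed, so check each PID against the running configuration before killing it.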
Topic revision: r8 - 28 Jul 2017 - DanielArcaro
Copyright © 2008-2022 by the contributing authors. All material on this collaboration platform is the property of the contributing authors.