Wednesday, November 02, 2011

I have lost my trust in Solaris Cluster

root@sx:/builds/ohac/usr/src# /usr/gnu/bin/grep -c -r XXX * | grep -v ":0"
cmd/scshutdown/scshutdown.c:1
cmd/mdm/rpc.metacld/common/mdc_init.c:2
cmd/dcs/run_reserve.ksh:1
cmd/dcs/scvxvmck.cc:2
cmd/dcs/reserve.cc:5
cmd/dcs/scsi.cc:2
cmd/dcs/scgdevs.c:2
cmd/scqsd/QuorumServer.cc:1
cmd/ha-services/hafoip/hafoip_start.c:1
cmd/ha-services/hafoip/hafoip_prenet_start.c:1
cmd/ha-services/hafoip/hafoip_stop.c:1
cmd/ha-services/haevent/event.c:1
cmd/ha-services/hastorageplus/hastorageplus_common.c:1
cmd/ha-services/hascip/hascip_init.c:1
cmd/rpc.scxcfg/scxcfgd_svc.c:2
cmd/rpc.scxcfg/scxcfgd_ccr.cc:1
cmd/scconf/scconf.c:1
cmd/cl_apid/cl_apid_main.cc:1
cmd/cl_apid/caapi_event.cc:1
cmd/cl_apid/caapi_mapping.cc:3
cmd/cl_apid/caapi_reg_cache.cc:4
cmd/cl_apid/caapi_event.h:1
cmd/cl_apid/caapi_mapping.h:1
cmd/cl_apid/client_reg_info.cc:2
cmd/ssm_wrapper/ssm_ipmp.c:1
cmd/ssm_wrapper/ssm_call.c:5
cmd/ssm_wrapper/ssm_net.c:2
cmd/ssm_wrapper/ssm_ipmp_callback.c:1
cmd/cmm/pmmd/pmmd_impl.cc:10
cmd/cmm/pmmd_adm/pmmd_adm.cc:3
cmd/cl_eventd/cl_eventd.cc:3
cmd/pxfs/clexecd.cc:4
cmd/replctl/replctl_main.cc:1
cmd/did/didadm.c:3
cmd/rpc.clquery/clquery_device.c:3
cmd/rpc.clquery/rpc_queue.c:8
cmd/rpc.clquery/rpc_manager.c:2
cmd/rpc.clquery/clquery_network.c:3
cmd/rpc.clquery/clquery_main.c:3
common/cl/repl/repl_mgr/rm_repl_service.cc:13
common/cl/repl/repl_mgr/rm_state_mach.cc:8
common/cl/repl/repl_mgr/cb_coord_control_impl.cc:4
common/cl/repl/repl_mgr/repl_mgr_impl.cc:4
common/cl/repl/repl_mgr/dependency_mgr_impl.cc:1
common/cl/repl/repl_mgr/service_admin_impl.cc:1
common/cl/repl/repl_mgr/dependency_mgr_impl.h:1
common/cl/repl/rmm/rmm.cc:3
common/cl/repl/service/replica_handler.cc:4
common/cl/repl/service/repl_service.h:2
common/cl/repl/service/multi_ckpt_handler.cc:1
common/cl/repl/service/repl_service.cc:3
common/cl/libuos/uos_misc.cc:1
common/cl/libuos/cladm_impl.cc:9
common/cl/clprivnet/clprivnet.h:3
common/cl/cmm/membership_engine_impl.cc:33
common/cl/cmm/automaton_impl.cc:4
common/cl/cmm/cmm_comm_impl.cc:1
common/cl/cmm/callback_registry_impl.cc:1
common/cl/cmm/ucmm_comm_impl.cc:22
common/cl/cmm/membership_manager_impl.cc:52
common/cl/cmm/ff_impl.cc:8
common/cl/cmm/cmm_impl.cc:1
common/cl/cmm/ucmm_api.h:1
common/cl/cmm/ucmm_impl.cc:8
common/cl/pxfs/lib/pxfs_misc.cc:1
common/cl/pxfs/client/pxspecial.cc:3
common/cl/pxfs/client/pxvfs.cc:12
common/cl/pxfs/mount/mount_server_impl.cc:13
common/cl/pxfs/mount/mount_client_impl.cc:21

The same can be done for " bug " and " workaround ", and the comments themselves are worth reading:

// XXX Pass by reference of cast value doesn't work with SC5.2
// XXX Need to file a bug, in the mean time we can work around
// XXX it by passing in the proper type and cast it back.
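
To repeat the count for the other markers, the grep invocation from the top of the post could be wrapped in a small helper. This is only a sketch: the name count_markers is made up for illustration, the -w and -i flags are an assumption about what searching for " bug " with surrounding spaces is meant to achieve, and it expects GNU grep (on Solaris, /usr/gnu/bin/grep).

```shell
# count_markers PATTERN [DIR]
# Hypothetical helper: prints "file:count" for every file under DIR
# (default: current directory) with at least one line that contains
# PATTERN as a whole word.
# Note: grep -c counts matching lines, not individual occurrences.
count_markers() {
    pattern=$1
    dir=${2:-.}
    # -r recurse, -c count matching lines,
    # -w whole-word match and -i ignore case (both assumed, see above).
    # ':0$' drops files with zero matches without hiding counts like 10 or 20.
    grep -r -c -w -i -- "$pattern" "$dir" | grep -v ':0$'
}

# Usage, run from the source root:
#   for p in XXX bug workaround; do echo "== $p"; count_markers "$p"; done
```

Anchoring the filter with ':0$' is slightly stricter than the plain ":0" used above, which would also hide any file whose path happens to contain ":0".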

The cluster drivers, which are Solaris kernel modules, are written in C and C++. On top of that, the code is compiled in debug mode, and the resulting objects are structurally altered. Some of the functions the cluster needs already exist in the Solaris kernel, and the modules override them. Once you get past the initialization problem in cl_boostrap on nv151, further failures appear because Sun Cluster was compiled with a different compiler (and different CFLAGS) than the system itself. We end up with kernel code and an ABI that are terribly sensitive to compiler changes. The cluster also pokes around in layer 3 of the OSI model and may not be compatible with every network card driver.

CTF is generated from stabs, and stabs are only generated in debug mode (-g). Unfortunately, debug mode has side effects:
- It disables the inline generation of functions, and this can have a major impact on performance. The C++ compiler provides the -g0 option, which offers the same features as -g but does not disable inlining.
- It enables variable globalization, which means that static local variables will be defined as global instead of local and will have their names made unique by prepending a file-specific global prefix. This increases the size of binaries and can break mdb debugging modules which use some static symbol defined in Sun Cluster.
  The Sun Studio 10 and 11 C++ compilers have no option to disable globalization. RFE 6289358 has been opened to get this option. While this option is not available, globalization can be disabled by using the -G option of the ctfconvert binary.
- On sparc, it disables tail-call optimization, which produces less optimized code and can impact performance. The C++ compiler provides the option -Qoption cg -Qiselect-T1 to re-enable tail-call optimization.
The debug mode has no other known side effects. Some other optimizations which may have been disabled will be re-enabled by the -xO3 optimization level.
