Integrating Ironic with native Neutron
Deployment and configuration:
- Ironic has its own dhcp-server, used during the inspect phase
- neutron-dhcp is used during the provision phase
- The tftp servers used in the inspect and provision phases may differ

Register phase
The user enters the ironic node, including IPMI and other information

Inspect phase
This phase uses the Inspect Network, which requires:
- The Ironic dhcp-server can receive DHCP requests from the BM node.
- After the BM node gets an IP, it can reach tftp-server-1 (layer 3 reachable)

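The user obtains the BM node information
- Ironic sets the BM node to PXE boot via IPMI
- Ironic powers on the BM node via IPMI, and it PXE boots
- The BM node gets an IP from the Ironic dhcp-server. At this point the BM node's requests carry no vlan tag and use the native vlan of the uplink access switch (default tag=1)
- After getting the IP, the BM node downloads a small image from tftp-server-1 (a ramdisk containing the Ironic Python Agent)
- Some operations run to collect detailed information about the BM node
- The BM node is powered off. The ramdisk runs in memory and is lost after power-off.
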
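Provision phase
This phase uses the Provisioning Network (created by neutron), which requires:
- BM, glance-api, ironic-api, ironic-conductor, and neutron-dhcp-agent all need connectivity on the PROVISION NETWORK
The user requests a physical machine, installs the operating system, configures the service NICs, and so on
- The entry point is nova
- Ironic powers on the BM node via IPMI, and it PXE boots
- At this point the BM node must get its IP from the neutron-dhcp-server (over the native vlan). But since the Ironic-dhcp-server also accepts requests arriving on the native vlan, you must make sure the DHCP request is handled by the Neutron-dhcp-server.
- After getting the IP, the BM node downloads a small image (a ramdisk containing the Ironic Python Agent) from tftp-server-2 (which can differ from the tftp server used in the Inspect phase)
- (How is this step controlled?) The image requested by the user is downloaded from glance and installed (the assigned IP must be able to reach glance-api)
- After installation, cloud-init applies the corresponding vlan tag inside the BM operating system (that vlan tag must be pre-configured on the access switch)
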
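Key problems:
- Both the Ironic-dhcp-server and the Neutron-dhcp-server accept DHCP requests arriving on the native vlan. If two BM nodes run Inspect and Provision at the same time, they may conflict.
- Merge the two DHCP servers. But the Neutron-dhcp-server works by whitelist, and during the Inspect phase the dhcp-server does not yet know anything about the BM node, so the whitelist cannot be configured.
- Strictly separate the Inspect and Provision phases. During data center initialization, turn on the Ironic-dhcp-server and turn it off once Inspect is done; or have EPC forcibly disable Provision operations while Inspect is in progress.

* The ZTE solution for the tier-1 private cloud merged the two DHCP servers and runs them on the ToR switch.
- The tenant vlan of a BM node must be pre-configured on the access switch; if that is not possible, the switch has to be configured dynamically
- The Neutron-dhcp-agent needs to be on the service network
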
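The DHCP conflict described above can be made concrete with a small model. This is only an illustrative sketch, not Ironic or Neutron code: the `DhcpServer` class, the `responders` helper, and the MAC addresses are all hypothetical. The only behaviour taken from the notes is that the Neutron server answers whitelisted MACs while the Ironic server answers anything on the native vlan.

```python
from dataclasses import dataclass
from typing import List, Optional, Set


@dataclass
class DhcpServer:
    """Toy model of a DHCP server listening on the native vlan."""
    name: str
    enabled: bool = True
    # None means "answer any MAC"; a set means "answer only these MACs".
    allowlist: Optional[Set[str]] = None

    def answers(self, mac: str) -> bool:
        if not self.enabled:
            return False
        return self.allowlist is None or mac in self.allowlist


def responders(mac: str, servers: List[DhcpServer]) -> List[str]:
    """Names of every server that would answer a DHCP request from mac."""
    return [s.name for s in servers if s.answers(mac)]


ironic_dhcp = DhcpServer("ironic-dhcp")  # answers everyone
neutron_dhcp = DhcpServer("neutron-dhcp", allowlist={"aa:bb:cc:00:00:01"})
servers = [ironic_dhcp, neutron_dhcp]

# A node being provisioned is known to Neutron, but the Ironic server answers too.
print(responders("aa:bb:cc:00:00:01", servers))
# A node being inspected is unknown to Neutron, so only the Ironic server answers.
print(responders("aa:bb:cc:00:00:02", servers))
# Turning the Ironic server off after Inspect leaves a single responder.
ironic_dhcp.enabled = False
print(responders("aa:bb:cc:00:00:01", servers))
```

Two responders for the same request is the race the notes warn about; the "strictly separate the phases" option corresponds to flipping `enabled` on the Ironic server.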
Suzhou Ironic environment
10.142.24.12 root/@IDC_host4321

Zhejiang Ironic test environment

Ironic DHCP
[root@csv-yglcs17 ~]# cat /etc/dhcp/dhc
dhclient.d/ dhcpd6.conf dhcpd.conf
[root@csv-yglcs17 ~]# cat /etc/dhcp/dhcpd.conf
option domain-name "test.com";
option domain-name-servers 8.8.8.8, 61.88.88.88;
default-lease-time 60000;
max-lease-time 720000;
subnet 20.26.34.0 netmask 255.255.255.0 {
  range 20.26.34.10 20.26.34.100; <== DHCP range
  option routers 20.26.34.1;
  next-server 20.26.33.26; <== tftp server
  filename "pxelinux.0";
}
subnet 20.26.33.0 netmask 255.255.255.0 { <== the conductor node only has an IP in the 33.0 segment; without this subnet declaration, dhcpd fails at startup with the error below
}

Problem:
Apr 19 14:30:21 csv-yglcs17 systemd: Starting DHCPv4 Server Daemon...
Apr 19 14:30:21 csv-yglcs17 dhcpd: Internet Systems Consortium DHCP Server 4.2.5
Apr 19 14:30:21 csv-yglcs17 dhcpd: Copyright 2004-2013 Internet Systems Consortium.
Apr 19 14:30:21 csv-yglcs17 dhcpd: All rights reserved.
Apr 19 14:30:21 csv-yglcs17 dhcpd: For info, please visit https://www.isc.org/software/dhcp/
Apr 19 14:30:21 csv-yglcs17 dhcpd: Not searching LDAP since ldap-server, ldap-port and ldap-base-dn were not specified in the config file
Apr 19 14:30:21 csv-yglcs17 dhcpd: Wrote 15 leases to leases file.
Apr 19 14:30:21 csv-yglcs17 dhcpd:
Apr 19 14:30:21 csv-yglcs17 dhcpd: No subnet declaration for eno33557248 (no IPv4 addresses).
Apr 19 14:30:21 csv-yglcs17 dhcpd: ** Ignoring requests on eno33557248. If this is not what
Apr 19 14:30:21 csv-yglcs17 dhcpd: you want, please write a subnet declaration
Apr 19 14:30:21 csv-yglcs17 dhcpd: in your dhcpd.conf file for the network segment
Apr 19 14:30:21 csv-yglcs17 dhcpd: to which interface eno33557248 is attached. **
Apr 19 14:30:21 csv-yglcs17 dhcpd:
Apr 19 14:30:21 csv-yglcs17 dhcpd:
Apr 19 14:30:21 csv-yglcs17 dhcpd: No subnet declaration for virbr0 (192.168.122.1).
Apr 19 14:30:21 csv-yglcs17 dhcpd: ** Ignoring requests on virbr0. If this is not what
Apr 19 14:30:21 csv-yglcs17 dhcpd: you want, please write a subnet declaration
Apr 19 14:30:21 csv-yglcs17 dhcpd: in your dhcpd.conf file for the network segment
Apr 19 14:30:21 csv-yglcs17 dhcpd: to which interface virbr0 is attached. **
Apr 19 14:30:21 csv-yglcs17 dhcpd:
Apr 19 14:30:21 csv-yglcs17 dhcpd:
Apr 19 14:30:21 csv-yglcs17 dhcpd: No subnet declaration for eno16777984 (20.26.33.26).
Apr 19 14:30:21 csv-yglcs17 dhcpd: ** Ignoring requests on eno16777984. If this is not what
Apr 19 14:30:21 csv-yglcs17 dhcpd: you want, please write a subnet declaration
Apr 19 14:30:21 csv-yglcs17 dhcpd: in your dhcpd.conf file for the network segment
Apr 19 14:30:21 csv-yglcs17 dhcpd: to which interface eno16777984 is attached. **
Apr 19 14:30:21 csv-yglcs17 dhcpd:
Apr 19 14:30:21 csv-yglcs17 dhcpd:
Apr 19 14:30:21 csv-yglcs17 dhcpd: Not configured to listen on any interfaces!
Apr 19 14:30:21 csv-yglcs17 dhcpd:
Apr 19 14:30:21 csv-yglcs17 dhcpd: This version of ISC DHCP is based on the release available
Apr 19 14:30:21 csv-yglcs17 dhcpd: on ftp.isc.org. Features have been added and other changes
Apr 19 14:30:21 csv-yglcs17 dhcpd: have been made to the base software release in order to make
Apr 19 14:30:21 csv-yglcs17 dhcpd: it work better with this distribution.
Apr 19 14:30:21 csv-yglcs17 dhcpd:
Apr 19 14:30:21 csv-yglcs17 dhcpd: Please report for this software via the CentOS Bugs Database:
Apr 19 14:30:21 csv-yglcs17 dhcpd: http://bugs.centos.org/
Apr 19 14:30:21 csv-yglcs17 dhcpd:
Apr 19 14:30:21 csv-yglcs17 dhcpd: exiting.
Apr 19 14:30:21 csv-yglcs17 systemd: dhcpd.service: main process exited, code=exited, status=1/FAILURE
Apr 19 14:30:21 csv-yglcs17 systemd: Failed to start DHCPv4 Server Daemon.
Apr 19 14:30:21 csv-yglcs17 systemd: Unit dhcpd.service entered failed state.
Apr 19 14:30:21 csv-yglcs17 systemd: dhcpd.service failed.

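The rule behind the "No subnet declaration" messages can be checked offline before restarting dhcpd. The following is a rough sketch using only Python's standard `ipaddress` module; the helper name `listening_interfaces` is made up for illustration, and the interface addresses are the ones printed in the log.

```python
import ipaddress
from typing import Dict, List, Optional


def listening_interfaces(
    interfaces: Dict[str, Optional[str]], subnets: List[str]
) -> List[str]:
    """Return interfaces whose IPv4 address falls inside a declared subnet.

    An interface is skipped unless some subnet block covers its address,
    which is the condition the dhcpd log messages describe.
    """
    declared = [ipaddress.ip_network(s) for s in subnets]
    usable = []
    for name, addr in interfaces.items():
        if addr is None:  # interface has no IPv4 address
            continue
        ip = ipaddress.ip_address(addr)
        if any(ip in net for net in declared):
            usable.append(name)
    return usable


# Interface addresses taken from the dhcpd log.
host = {
    "eno33557248": None,
    "virbr0": "192.168.122.1",
    "eno16777984": "20.26.33.26",
}

# Only the 34.0 subnet declared: no interface qualifies, matching the failure.
print(listening_interfaces(host, ["20.26.34.0/24"]))
# With the empty 33.0 subnet declared as well, eno16777984 qualifies.
print(listening_interfaces(host, ["20.26.34.0/24", "20.26.33.0/24"]))
```

This is why the otherwise empty `subnet 20.26.33.0` block is needed: it gives dhcpd an interface to listen on, even though no addresses are handed out from that segment.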
Ironic Inspector
[root@csv-yglcs17 pxelinux.cfg]# pwd
/tftpboot/pxelinux.cfg
[root@csv-yglcs17 pxelinux.cfg]# cat default
default introspect
label introspect
kernel /tftpboot/ironic-inspector/inspector-kernel
append initrd=/tftpboot/ironic-inspector/inspector-ramdisk ipa-inspection-callback-url=http://20.26.33.26:5050/v1/continue systemd.journald.forward_to_console=yes ipa-collect-lldp=True
ipappend 3

The inspector runs on 20.26.33.26

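When the ramdisk boots but never reports back, one quick sanity check is whether the callback URL in the `append` line really points at the inspector host. Below is a minimal sketch that splits the `append` line into kernel parameters; `parse_append` is a hypothetical helper, not part of ironic-inspector.

```python
from typing import Dict


def parse_append(line: str) -> Dict[str, str]:
    """Split a pxelinux 'append' line into kernel parameters.

    Parameters without '=' are stored with an empty string value.
    """
    tokens = line.split()
    if tokens and tokens[0] == "append":
        tokens = tokens[1:]
    params = {}
    for token in tokens:
        # Split on the first '=' only, so values may themselves contain '='.
        key, _, value = token.partition("=")
        params[key] = value
    return params


append_line = (
    "append initrd=/tftpboot/ironic-inspector/inspector-ramdisk "
    "ipa-inspection-callback-url=http://20.26.33.26:5050/v1/continue "
    "systemd.journald.forward_to_console=yes ipa-collect-lldp=True"
)
params = parse_append(append_line)
print(params["ipa-inspection-callback-url"])
print(params["ipa-collect-lldp"])
```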
Ironic Provisioning
provisioning_network in ironic.conf is not configured yet. Neither is cleaning_network.
Check IPMI
[root@csv-yglcs17 ~]# ipmitool -P Huawei12#$ -H 50.1.65.245 -U root -p 623 -I lanplus power status
Chassis Power is on

== Operations ==

ironic node-create --chassis_uuid dbb588b3-75e8-4028-b851-110671e05e58 \
    --driver agent_ipmitool \
    --name pc-zjnacthd01 \
    -i ipmi_address=50.1.65.245 \
    -i ipmi_username=root \
    -i ipmi_password=Huawei12#$ \
    -i ipmi_port=623 \
    -i driver_info/deploy_kernel=4c1855e5-9b6b-47e2-89e5-3bc351c2ae2e \
    -i driver_info/deploy_ramdisk=2f603c85-de92-44ea-b4d0-1396b91102cc

Update 5/25: An Ironic AZ feature is under development. node-update adds an AZ property to the node, which is synced to the nova database. nova boot then only needs to specify the AZ to create the machine.

update 5/12:

After inspect succeeded:

inspect failed; for the cause see "Problem 2"

Configure provisioning_network:

After Inspect succeeded:

Upload the image used by Ironic:
glance image-create --name CentOS-7-64bit-ironic.qcow2 --disk-format qcow2 --container-format bare --file CentOS-7-64bit-ironic.qcow2 --is-public True --human-readable --progress
glance image-update 40928b81-9be1-402a-8684-4e2d2fcf330f --property hypervisor_type=baremetal

nova boot --flavor 2 --image 40928b81-9be1-402a-8684-4e2d2fcf330f --nic net-id=3a151049-ff3f-4bc5-88a1-b9084ec24bc9 pc-zjnacthd01

== Problems ==
- Is there a restriction on node names?

- The first Inspect failed
2017-04-20 15:29:16.409 28596 ERROR ironic_inspector.main File "/usr/lib/python2.7/site-packages/keystoneauth1/access/service_catalog.py", line 228, in url_for
2017-04-20 15:29:16.409 28596 ERROR ironic_inspector.main raise exceptions.EndpointNotFound(msg)
2017-04-20 15:29:16.409 28596 ERROR ironic_inspector.main EndpointNotFound: public endpoint for baremetal service in RegionFour region not found

Solved by restarting the ironic service

- The second inspect failed: the BM node could not get an IP
The DHCP request had already reached the dhcp server:

- cleaning_network not found during inspect
Configure cleaning_network (=provide_network)

- nova boot failed, conductor.log:

After updating the nova code on the controller node, the ironic code on the ironic node, and the ironicclient code on the compute node, the problem was solved

- nova boot failed, compute.log

The cause was that the driver_info of this ironic node had not been updated yet:

Update it:
ironic node-update baa519fc-7c06-40f8-8e5a-5fd3b6e97e01 add driver_info/deploy_kernel=f8205536-070b-4286-8d0c-35e3b8647741
ironic node-update baa519fc-7c06-40f8-8e5a-5fd3b6e97e01 add driver_info/deploy_ramdisk=302e6438-4d31-429b-8bae-47e225d4ed67
update 05/12:
ironic node-update baa519fc-7c06-40f8-8e5a-5fd3b6e97e01 add driver_info/deploy_kernel=4c1855e5-9b6b-47e2-89e5-3bc351c2ae2e
ironic node-update baa519fc-7c06-40f8-8e5a-5fd3b6e97e01 add driver_info/deploy_ramdisk=2f603c85-de92-44ea-b4d0-1396b91102cc

- nova boot failed, image not found, compute.log

glance-api was misconfigured in nova.conf on the compute node:

Add the glance API version (=1) to ironic.conf on the ironic-conductor node

glance_api_version=1

- nova boot failed, ironic-conductor.log:

Verified from the command line that a port can be created on provisioning network d5a284c3-41d3-4eb3-a11f-58a99d3e2eb1

The cause was that LLDP was not enabled. After enabling it:
ironic port-list | awk '{print $2}' | egrep '[0-9]+' | xargs -I 'X' ironic port-delete 'X'
ironic portgroup-list | awk '{print $2}' | egrep '[0-9]+' | xargs -I 'X' ironic portgroup-delete 'X'
Re-run Inspect:
ironic node-set-provision-state baa519fc-7c06-40f8-8e5a-5fd3b6e97e01 manage
ironic node-set-provision-state baa519fc-7c06-40f8-8e5a-5fd3b6e97e01 inspect

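The awk/egrep pipeline above keeps any second-column token that contains a digit, which is enough to skip the table header and borders. A stricter variant matches full UUIDs before anything is deleted. The sketch below assumes the usual ASCII table layout printed by the CLI; the sample rows, UUIDs, and MAC addresses are invented for illustration.

```python
import re
from typing import List

UUID_RE = re.compile(
    r"[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}"
)


def first_column_uuids(table: str) -> List[str]:
    """Return the UUID found in the first column of each table row."""
    uuids = []
    for line in table.splitlines():
        cells = [c.strip() for c in line.split("|")]
        # A data row looks like "| <uuid> | ... |", so cells[1] is column one.
        if len(cells) > 1 and UUID_RE.fullmatch(cells[1]):
            uuids.append(cells[1])
    return uuids


# Hypothetical listing in the shape of a CLI table.
sample = """\
+--------------------------------------+-------------------+
| UUID                                 | Address           |
+--------------------------------------+-------------------+
| 11111111-2222-3333-4444-555555555555 | 52:54:00:00:00:01 |
| 66666666-7777-8888-9999-aaaaaaaaaaaa | 52:54:00:00:00:02 |
+--------------------------------------+-------------------+
"""
for uuid in first_column_uuids(sample):
    print(uuid)
```

Printing the list first and reviewing it before feeding it to the delete command is a cheap safeguard, since the deletion is not reversible without re-running Inspect.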
- nova boot failed, user image not found
The cause was a wrong database setting in glance-registry.conf.

- nova boot failed, ramdisk not found

This image UUID is configured in the ironic node's driver_info, and the image needs to be uploaded to glance

Upload the image:

Update the Ironic node information:

- nova boot failed, insufficient permission to access tftp

chown -R ironic:ironic /tftpboot/

- nova boot failed, the physical machine's DHCP request was captured by ironic-dhcp
Turn off ironic-dhcp
- nova boot failed, the physical machine could not get an IP from neutron DHCP
On the controller node, neutron dhcp is a dnsmasq started inside a namespace. The relay destination address is the controller node's management network IP (eno16777984), while dnsmasq listens on the namespace's tap device, whose IP is 20.26.34.91, so it never receives the DHCP request.
The current workaround: manually start a dnsmasq on the controller node, using the same configuration as neutron dhcp
- After getting an IP the node enters the ramdisk system, but after reboot it cannot boot into the operating system of the user image
Checking the boot device order in the BIOS shows - Boot Device Selector : No override
ironic-conductor.log shows that it cannot connect to 20.26.34.70:9999. This is the IPA's address and listening port; the ironic-conductor node must be able to reach it, but it really is unreachable.

Yao Jun says that possibly two NICs got an IP address after the ramdisk booted, which confused routing, and suggests that we delete the second address once the ramdisk is up.
05/04 update: add a static route on the provisioning network: destination=the controller node's subnet, nexthop=the provisioning network GW
05/11 update: neutron subnet-update aca03dd8-3d2a-4c54-99de-7a8a7bac4f53 --host-route destination=20.26.33.0/24,nexthop=20.26.34.1
Updated subnet: aca03dd8-3d2a-4c54-99de-7a8a7bac4f53

Verified working: the port can be reached and the user image downloaded. ==> Why do multiple NICs get an IP, and how can this be solved at the code level?

Query the boot order via IPMI: ipmitool -P Huawei12#$ -H 50.1.65.245 -U root -p 623 -I lanplus chassis bootparam get 5
Set boot from disk: ipmitool -P Huawei12#$ -H 50.1.65.245 -U root -p 623 -I lanplus chassis bootdev disk

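One way to see why the host route helps is to model the ramdisk's route lookup as a longest-prefix match. The sketch below is only an illustration of that idea: the routing table of the real ramdisk was not recorded in these notes, so the default route via `192.0.2.1` (a documentation address standing in for a gateway learned on the second NIC) is purely an assumption, and `next_hop` is a made-up helper.

```python
import ipaddress
from typing import List, Optional, Tuple

# (destination CIDR, next hop or None if directly connected)
Route = Tuple[str, Optional[str]]


def next_hop(dst: str, routes: List[Route]) -> Optional[str]:
    """Pick the route with the longest matching prefix.

    Returns the next hop, "on-link" for directly connected networks,
    or None when no route matches.
    """
    ip = ipaddress.ip_address(dst)
    best = None
    for cidr, hop in routes:
        net = ipaddress.ip_network(cidr)
        if ip in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, hop)
    if best is None:
        return None
    return best[1] or "on-link"


# Assumed state before the fix: replies to the conductor follow the default route.
before = [("20.26.34.0/24", None), ("0.0.0.0/0", "192.0.2.1")]
# The host route from the subnet-update command is more specific than the default.
after = before + [("20.26.33.0/24", "20.26.34.1")]

print(next_hop("20.26.33.26", before))
print(next_hop("20.26.33.26", after))
```

Under that assumption, traffic towards the conductor at 20.26.33.26 leaves through the provisioning gateway once the /24 host route is present, regardless of which NIC supplied the default route.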
- The user image was written to /dev/sdl instead of the first disk, and the whole boot process timed out

a. Yao Jun modified the ramdisk to always use /dev/sda as the target disk
b. Set deploy_callback_timeout=900 in ironic.conf

Update 05/04:
Li Hao: ironic node-update 4fae2ae3-0935-4585-8be2-00298015f8f3 replace properties/root_device='{"name": "/dev/sda"}'

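The root_device value is a JSON object embedded in a shell argument, so quoting mistakes are easy to make when typing it by hand. A small sketch of generating the argument instead, using only the standard library; `root_device_update_argv` is a hypothetical helper that just reproduces the command above.

```python
import json
import shlex
from typing import Dict, List


def root_device_update_argv(node_uuid: str, hints: Dict[str, str]) -> List[str]:
    """Build the argv for the node-update command that sets root_device."""
    return [
        "ironic", "node-update", node_uuid, "replace",
        "properties/root_device=" + json.dumps(hints),
    ]


argv = root_device_update_argv(
    "4fae2ae3-0935-4585-8be2-00298015f8f3", {"name": "/dev/sda"}
)
# shlex.join quotes the JSON so the line can be pasted into a shell.
print(shlex.join(argv))
```

Passing `argv` directly to `subprocess.run` would avoid shell quoting altogether; the joined string is only for copy and paste.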
- The image was written to /dev/sda, but ironic-conductor did not reboot the machine, so the boot hung
Use journalctl -fu python-ironic-agent to view the logs inside the IPA
journalctl --no-pager

- After the image was written to /dev/sda, the IPA failed to run partprobe /dev/sda

The ironic-lib inside the ramdisk needs a patch: https://review.openstack.org/#/c/444061/