Integrating Ironic with native Neutron
Deployment and configuration:
- Ironic has its own dhcp-server, used during the Inspect phase
- neutron-dhcp is used during the Provision phase
- The Inspect and Provision phases may use different tftp servers

Register phase
The user enrolls an ironic node, including its IPMI details and other information.

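Inspect phase
This phase uses the Inspect Network. Requirements:
- The Ironic dhcp-server can receive DHCP requests from the bare-metal (BM) node.
- Once the BM node has an IP, it can reach tftp-server-1 (layer-3 reachability).
The user collects the BM node information:
- Ironic sets the BM node to PXE boot via IPMI.
- Ironic powers on the BM node via IPMI, and the node PXE boots.
- The BM node gets an IP from the Ironic dhcp-server. At this point its requests carry no vlan tag, so they use the native vlan of the uplink access switch (default tag=1).
- With an IP assigned, the BM node downloads a small image from tftp-server-1 (a ramdisk containing the Ironic Python Agent).
- The agent runs a set of operations to collect detailed information about the BM node.
- The BM node is powered off. The ramdisk runs in memory, so it is lost on power-off.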
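
The IPMI-driven steps of the Inspect phase can be sketched as a dry run. This is only an illustration, not how Ironic issues the calls internally: `inspect_ipmi_steps` is a hypothetical helper that prints the `ipmitool` commands an operator could run by hand, and the address and credentials are placeholders.

```shell
# Print (do not execute) the ipmitool commands that mirror the IPMI steps
# of the Inspect phase. inspect_ipmi_steps is a hypothetical helper name.
# $1: BMC address, $2: IPMI user, $3: IPMI password
inspect_ipmi_steps() {
    base="ipmitool -I lanplus -H $1 -U $2 -P $3"
    echo "$base chassis bootdev pxe"   # next boot comes from the network (PXE)
    echo "$base power on"              # power on; the node PXE boots the ramdisk
    echo "$base power off"             # after inspection the node is shut down
}

# Placeholder BMC address and credentials.
inspect_ipmi_steps 192.0.2.10 admin secret
```

Printing instead of executing keeps the sketch safe to try on a machine with no BMC access.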
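
Provision phase
This phase uses the Provisioning Network (created by neutron). Requirements:
- The BM node, glance-api, ironic-api, ironic-conductor, and neutron-dhcp-agent must all have connectivity on the PROVISION NETWORK.
The user requests a physical machine, installs the operating system, configures the service NICs, and so on:
- The request enters through nova.
- Ironic powers on the BM node via IPMI, and the node PXE boots.
- At this point the BM node must get its IP from neutron-dhcp-server (over the native vlan). Because Ironic-dhcp-server also accepts requests arriving on the native vlan, the DHCP request must be guaranteed to be handled by Neutron-dhcp-server.
- With an IP assigned, the BM node downloads a small image (a ramdisk containing the Ironic Python Agent) from tftp-server-2, which may differ from the tftp server used in the Inspect phase.
- (How is this step controlled?) The image requested by the user is downloaded from glance and installed (the assigned IP must be able to reach glance-api).
- After installation, cloud-init applies the corresponding vlan tag inside the BM operating system (that vlan tag must be pre-configured on the access switch).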
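
Key problems:
- Both Ironic-dhcp-server and Neutron-dhcp-server accept DHCP requests arriving on the native vlan, so two BM nodes running Inspect and Provision at the same time may conflict.
- One option is to merge the two DHCP servers. However, Neutron-dhcp-server works from a whitelist, and during the Inspect phase the dhcp-server does not yet know anything about the BM node, so no whitelist entry can be configured.
- Another option is to strictly separate the Inspect and Provision phases: start Ironic-dhcp-server during data-center initialization and shut it down once Inspect is done, or have EPC enforce that Provision operations are disabled while Inspect is running.

* The ZTE solution for the tier-1 private cloud merged the two DHCP servers and runs them on the ToR switch.

- The tenant vlan of a BM node must be pre-configured on the access switch; if that is not possible, the switch has to be configured dynamically.
- Neutron-dhcp-agent must be on the service network.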
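
To make the DHCP conflict concrete, here is a toy model rather than real DHCP server behavior. It assumes Neutron's DHCP answers only MACs it already knows (the whitelist), while the Ironic dhcp-server answers any request on the native vlan. `dhcp_responders`, the whitelist file, and the MAC addresses are all hypothetical.

```shell
# Toy model: which DHCP servers would answer a request from a given MAC?
# $1: requesting MAC; $2: file with one Neutron-known MAC per line (the whitelist).
dhcp_responders() (
    mac=$1
    whitelist=$2
    if grep -qixF "$mac" "$whitelist"; then
        # Provision phase: Neutron knows the MAC, but the Ironic dhcp-server
        # listens on the same native vlan, so both would answer.
        echo "neutron-dhcp ironic-dhcp"
    else
        # Inspect phase: the node is unknown to Neutron, so only Ironic answers.
        echo "ironic-dhcp"
    fi
)

known=$(mktemp)
printf '%s\n' '52:54:00:aa:bb:01' > "$known"
dhcp_responders '52:54:00:AA:BB:01' "$known"   # node being provisioned
dhcp_responders '52:54:00:aa:bb:02' "$known"   # node being inspected
rm -f "$known"
```

In this model the provisioned node is the one at risk: two servers compete for its request, which is the reason for keeping the two phases apart.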

Suzhou Ironic environment
10.142.24.12 root/@IDC_host4321

Zhejiang Ironic test environment
Ironic DHCP
[root@csv-yglcs17 ~]# cat /etc/dhcp/dhc
dhclient.d/ dhcpd6.conf dhcpd.conf
[root@csv-yglcs17 ~]# cat /etc/dhcp/dhcpd.conf
option domain-name "test.com";
option domain-name-servers 8.8.8.8, 61.88.88.88;
default-lease-time 60000;
max-lease-time 720000;
subnet 20.26.34.0 netmask 255.255.255.0 {
    range 20.26.34.10 20.26.34.100;    # DHCP range
    option routers 20.26.34.1;
    next-server 20.26.33.26;           # tftp server
    filename "pxelinux.0";
}
# The conductor node only has an IP in the 33.0 subnet. Without this subnet
# declaration, dhcpd fails at startup with the error shown under "Problem" below.
subnet 20.26.33.0 netmask 255.255.255.0 {
}
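
The empty 20.26.33.0 block is there because dhcpd needs a subnet declaration that covers the address of the interface it listens on. A minimal sketch of that membership test in plain shell arithmetic follows; `ip_to_int` and `in_subnet` are illustrative helpers, not part of dhcpd.

```shell
# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() (
    IFS=.
    set -- $1
    echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
)

# Succeed if address $1 lies inside the subnet given by network $2 and netmask $3.
in_subnet() (
    ip=$(ip_to_int "$1")
    net=$(ip_to_int "$2")
    mask=$(ip_to_int "$3")
    [ $(( ip & mask )) -eq $(( net & mask )) ]
)

# The conductor's address is not covered by the 20.26.34.0 declaration ...
in_subnet 20.26.33.26 20.26.34.0 255.255.255.0 || echo "20.26.33.26 is outside 20.26.34.0/24"
# ... but it is covered once the empty 20.26.33.0 subnet is declared.
in_subnet 20.26.33.26 20.26.33.0 255.255.255.0 && echo "20.26.33.26 is inside 20.26.33.0/24"
```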

Problem:
Apr 19 14:30:21 csv-yglcs17 systemd: Starting DHCPv4 Server Daemon...
Apr 19 14:30:21 csv-yglcs17 dhcpd: Internet Systems Consortium DHCP Server 4.2.5
Apr 19 14:30:21 csv-yglcs17 dhcpd: Copyright 2004-2013 Internet Systems Consortium.
Apr 19 14:30:21 csv-yglcs17 dhcpd: All rights reserved.
Apr 19 14:30:21 csv-yglcs17 dhcpd: For info, please visit https://www.isc.org/software/dhcp/
Apr 19 14:30:21 csv-yglcs17 dhcpd: Not searching LDAP since ldap-server, ldap-port and ldap-base-dn were not specified in the config file
Apr 19 14:30:21 csv-yglcs17 dhcpd: Wrote 15 leases to leases file.
Apr 19 14:30:21 csv-yglcs17 dhcpd:
Apr 19 14:30:21 csv-yglcs17 dhcpd: No subnet declaration for eno33557248 (no IPv4 addresses).
Apr 19 14:30:21 csv-yglcs17 dhcpd: ** Ignoring requests on eno33557248. If this is not what
Apr 19 14:30:21 csv-yglcs17 dhcpd: you want, please write a subnet declaration
Apr 19 14:30:21 csv-yglcs17 dhcpd: in your dhcpd.conf file for the network segment
Apr 19 14:30:21 csv-yglcs17 dhcpd: to which interface eno33557248 is attached. **
Apr 19 14:30:21 csv-yglcs17 dhcpd:
Apr 19 14:30:21 csv-yglcs17 dhcpd:
Apr 19 14:30:21 csv-yglcs17 dhcpd: No subnet declaration for virbr0 (192.168.122.1).
Apr 19 14:30:21 csv-yglcs17 dhcpd: ** Ignoring requests on virbr0. If this is not what
Apr 19 14:30:21 csv-yglcs17 dhcpd: you want, please write a subnet declaration
Apr 19 14:30:21 csv-yglcs17 dhcpd: in your dhcpd.conf file for the network segment
Apr 19 14:30:21 csv-yglcs17 dhcpd: to which interface virbr0 is attached. **
Apr 19 14:30:21 csv-yglcs17 dhcpd:
Apr 19 14:30:21 csv-yglcs17 dhcpd:
Apr 19 14:30:21 csv-yglcs17 dhcpd: No subnet declaration for eno16777984 (20.26.33.26).
Apr 19 14:30:21 csv-yglcs17 dhcpd: ** Ignoring requests on eno16777984. If this is not what
Apr 19 14:30:21 csv-yglcs17 dhcpd: you want, please write a subnet declaration
Apr 19 14:30:21 csv-yglcs17 dhcpd: in your dhcpd.conf file for the network segment
Apr 19 14:30:21 csv-yglcs17 dhcpd: to which interface eno16777984 is attached. **
Apr 19 14:30:21 csv-yglcs17 dhcpd:
Apr 19 14:30:21 csv-yglcs17 dhcpd:
Apr 19 14:30:21 csv-yglcs17 dhcpd: Not configured to listen on any interfaces!
Apr 19 14:30:21 csv-yglcs17 dhcpd:
Apr 19 14:30:21 csv-yglcs17 dhcpd: This version of ISC DHCP is based on the release available
Apr 19 14:30:21 csv-yglcs17 dhcpd: on ftp.isc.org. Features have been added and other changes
Apr 19 14:30:21 csv-yglcs17 dhcpd: have been made to the base software release in order to make
Apr 19 14:30:21 csv-yglcs17 dhcpd: it work better with this distribution.
Apr 19 14:30:21 csv-yglcs17 dhcpd:
Apr 19 14:30:21 csv-yglcs17 dhcpd: Please report for this software via the CentOS Bugs Database:
Apr 19 14:30:21 csv-yglcs17 dhcpd: http://bugs.centos.org/
Apr 19 14:30:21 csv-yglcs17 dhcpd:
Apr 19 14:30:21 csv-yglcs17 dhcpd: exiting.
Apr 19 14:30:21 csv-yglcs17 systemd: dhcpd.service: main process exited, code=exited, status=1/FAILURE
Apr 19 14:30:21 csv-yglcs17 systemd: Failed to start DHCPv4 Server Daemon.
Apr 19 14:30:21 csv-yglcs17 systemd: Unit dhcpd.service entered failed state.
Apr 19 14:30:21 csv-yglcs17 systemd: dhcpd.service failed.
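
When dhcpd exits like this, the interfaces it refused are easy to miss in the noise. A small filter can list them, sketched here on the assumption that the message format matches the log above; `missing_subnet_interfaces` is a hypothetical helper name.

```shell
# Read dhcpd log lines on stdin and print each interface that dhcpd
# reported as having no matching subnet declaration.
missing_subnet_interfaces() {
    sed -n 's/.*No subnet declaration for \([^ ]*\) (.*/\1/p'
}

# Example with three lines taken from the log above.
printf '%s\n' \
    'Apr 19 14:30:21 csv-yglcs17 dhcpd: No subnet declaration for virbr0 (192.168.122.1).' \
    'Apr 19 14:30:21 csv-yglcs17 dhcpd: No subnet declaration for eno16777984 (20.26.33.26).' \
    'Apr 19 14:30:21 csv-yglcs17 dhcpd: Not configured to listen on any interfaces!' |
    missing_subnet_interfaces
```

Any interface in that list that should serve DHCP needs a matching subnet block in dhcpd.conf.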

Ironic Inspector
[root@csv-yglcs17 pxelinux.cfg]# pwd
/tftpboot/pxelinux.cfg
[root@csv-yglcs17 pxelinux.cfg]# cat default
default introspect
label introspect
kernel /tftpboot/ironic-inspector/inspector-kernel
append initrd=/tftpboot/ironic-inspector/inspector-ramdisk ipa-inspection-callback-url=http://20.26.33.26:5050/v1/continue systemd.journald.forward_to_console=yes ipa-collect-lldp=True
ipappend 3
The inspector runs on 20.26.33.26
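
The `ipa-inspection-callback-url` kernel argument is what points the ramdisk at the inspector on 20.26.33.26. A quick way to check which URL a PXE config hands out is sketched below; `callback_url` is a hypothetical helper that reads a pxelinux config on stdin.

```shell
# Print the value of ipa-inspection-callback-url from a pxelinux config on stdin.
callback_url() {
    tr ' ' '\n' | sed -n 's/^ipa-inspection-callback-url=//p'
}

printf '%s\n' 'append initrd=/tftpboot/ironic-inspector/inspector-ramdisk ipa-inspection-callback-url=http://20.26.33.26:5050/v1/continue ipa-collect-lldp=True' |
    callback_url
```

On the TFTP host this could be fed with `callback_url < /tftpboot/pxelinux.cfg/default`.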

Ironic Provisioning
provisioning_network in ironic.conf is not configured yet. Neither is cleaning_network.

Check IPMI
[root@csv-yglcs17 ~]# ipmitool -P Huawei12#$ -H 50.1.65.245 -U root -p 623 -I lanplus power status
Chassis Power is on

== Operations ==

ironic node-create --chassis_uuid dbb588b3-75e8-4028-b851-110671e05e58 \
    --driver agent_ipmitool \
    --name pc-zjnacthd01 \
    -i ipmi_address=50.1.65.245 \
    -i ipmi_username=root \
    -i ipmi_password=Huawei12#$ \
    -i ipmi_port=623 \
    -i driver_info/deploy_kernel=4c1855e5-9b6b-47e2-89e5-3bc351c2ae2e \
    -i driver_info/deploy_ramdisk=2f603c85-de92-44ea-b4d0-1396b91102cc

Update 5/25: An Ironic AZ feature is under development. node-update adds an AZ attribute to the node and syncs it to the nova database, so nova boot only needs to specify the AZ to create a machine.

Update 5/12:

After inspect succeeded:

inspect failed; for the cause, see "Problem 2"

Configure provisioning_network:

After Inspect succeeded:

Upload the image used by Ironic:
glance image-create --name CentOS-7-64bit-ironic.qcow2 --disk-format qcow2 --container-format bare --file CentOS-7-64bit-ironic.qcow2 --is-public True --human-readable --progress
glance image-update 40928b81-9be1-402a-8684-4e2d2fcf330f --property hypervisor_type=baremetal

nova boot --flavor 2 --image 40928b81-9be1-402a-8684-4e2d2fcf330f --nic net-id=3a151049-ff3f-4bc5-88a1-b9084ec24bc9 pc-zjnacthd01

== Problems ==
- Is there a restriction on node names?

- The first Inspect failed
2017-04-20 15:29:16.409 28596 ERROR ironic_inspector.main File "/usr/lib/python2.7/site-packages/keystoneauth1/access/service_catalog.py", line 228, in url_for
2017-04-20 15:29:16.409 28596 ERROR ironic_inspector.main raise exceptions.EndpointNotFound(msg)
2017-04-20 15:29:16.409 28596 ERROR ironic_inspector.main EndpointNotFound: public endpoint for baremetal service in RegionFour region not found

Resolved after restarting the ironic service

- The second inspect failed: the BM node got no IP
The DHCP request had already reached the dhcp server:

- inspect cannot find cleaning_network
Configure cleaning_network (=provide_network)
- nova boot fails, conductor.log:

Resolved after updating the nova code on the control node, the ironic code on the ironic node, and the ironicclient code on the compute node

- nova boot fails, compute.log:

The cause is that this ironic node's driver_info had not been updated yet:

Update it:
ironic node-update baa519fc-7c06-40f8-8e5a-5fd3b6e97e01 add driver_info/deploy_kernel=f8205536-070b-4286-8d0c-35e3b8647741
ironic node-update baa519fc-7c06-40f8-8e5a-5fd3b6e97e01 add driver_info/deploy_ramdisk=302e6438-4d31-429b-8bae-47e225d4ed67
Update 05/12:
ironic node-update baa519fc-7c06-40f8-8e5a-5fd3b6e97e01 add driver_info/deploy_kernel=4c1855e5-9b6b-47e2-89e5-3bc351c2ae2e
ironic node-update baa519fc-7c06-40f8-8e5a-5fd3b6e97e01 add driver_info/deploy_ramdisk=2f603c85-de92-44ea-b4d0-1396b91102cc

- nova boot fails, image not found, compute.log:

glance-api was misconfigured in nova.conf on the compute node:

Add glance api version=1 to ironic.conf on the ironic-conductor node:

glance_api_version=1

- nova boot fails, ironic-conductor.log:

Verified on the command line that a port can be created on provisioning network d5a284c3-41d3-4eb3-a11f-58a99d3e2eb1

The cause is that LLDP was not enabled. After enabling it:

ironic port-list | awk '{print $2}' | egrep '[0-9]+' | xargs -I 'X' ironic port-delete 'X'
ironic portgroup-list | awk '{print $2}' | egrep '[0-9]+' | xargs -I 'X' ironic portgroup-delete 'X'
Re-run Inspect:
ironic node-set-provision-state baa519fc-7c06-40f8-8e5a-5fd3b6e97e01 manage
ironic node-set-provision-state baa519fc-7c06-40f8-8e5a-5fd3b6e97e01 inspect
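
The port cleanup above depends on `awk '{print $2}' | egrep '[0-9]+'` picking out just the UUID column. The sketch below runs the same filter over a made-up table, assuming the CLI prints the usual ASCII-table layout; the UUID and MAC are invented for illustration, and `grep -E` stands in for `egrep`.

```shell
# Extract the second whitespace-separated field and keep only values that
# contain a digit, which drops the table borders and the header row.
uuid_column() {
    awk '{print $2}' | grep -E '[0-9]+'
}

# Invented sample in the assumed table layout (not real CLI output).
cat <<'EOF' | uuid_column
+--------------------------------------+-------------------+
| UUID                                 | Address           |
+--------------------------------------+-------------------+
| 11111111-2222-3333-4444-555555555555 | 52:54:00:aa:bb:01 |
+--------------------------------------+-------------------+
EOF
```

The digit test is a loose heuristic: it works only because the header text and the border lines contain no digits in the second field.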

- nova boot fails, user image not found
The cause is a wrong database setting in glance-registry.conf.

- nova boot fails, ramdisk not found

This image UUID is configured in the ironic node's driver_info, and the image must be uploaded to glance
Upload the image:

Update the Ironic node information:

- nova boot fails, insufficient permissions to access tftp

chown -R ironic:ironic /tftpboot/

- nova boot fails: the physical machine's DHCP request was captured by ironic-dhcp
Shut down ironic-dhcp

- nova boot fails: the physical machine cannot get an IP from neutron DHCP
On the control node, neutron dhcp runs as dnsmasq inside a namespace. The relay destination is the control node's management-network IP (eno16777984), but dnsmasq listens on the namespace's tap interface, whose IP is 20.26.34.91, so it never receives the DHCP requests.
Current workaround: manually start a dnsmasq on the control node, using the same configuration as neutron dhcp
- After getting an IP the node enters the ramdisk system, but after reboot it does not boot into the user image's operating system
Checking the boot device order in the BIOS shows: Boot Device Selector : No override
ironic-conductor.log shows that 20.26.34.70:9999 cannot be reached. This is the IPA address and listening port; the ironic-conductor node must be able to connect to it, but it is indeed unreachable.

Yao Jun (姚军) said the likely cause is that, after the ramdisk boots, two NICs obtain IP addresses and the routing gets confused; he suggested deleting the second address once the ramdisk is up.
05/04 update: add a static route on the provisioning network: destination = the control node subnet, nexthop = the provisioning network GW
05/11 update: neutron subnet-update aca03dd8-3d2a-4c54-99de-7a8a7bac4f53 --host-route destination=20.26.33.0/24,nexthop=20.26.34.1
Updated subnet: aca03dd8-3d2a-4c54-99de-7a8a7bac4f53

Verified to work: the port is reachable and the user image downloads. ==> Why do multiple NICs get IPs, and how can this be fixed at the code level?
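
Why the host route helps can be seen by modelling the ramdisk's route lookup. This is a simplified sketch under the two-NIC hypothesis above, not the kernel's routing code: it assumes a table of `CIDR next-hop` lines and picks the longest matching prefix. `next_hop`, the table format, and the stray default gateway 10.0.0.1 are made up for illustration.

```shell
# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() (
    IFS=.
    set -- $1
    echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
)

# next_hop DEST: read "CIDR NEXTHOP" lines on stdin and print the next hop
# of the longest prefix that contains DEST (empty if no route matches).
next_hop() (
    dest=$(ip_to_int "$1")
    best_len=-1
    best_hop=
    while read -r cidr hop; do
        len=${cidr#*/}
        net=$(ip_to_int "${cidr%/*}")
        mask=$(( (4294967295 << (32 - len)) & 4294967295 ))
        if [ $(( dest & mask )) -eq $(( net & mask )) ] && [ "$len" -gt "$best_len" ]; then
            best_len=$len
            best_hop=$hop
        fi
    done
    echo "$best_hop"
)

# A default route learned via the second NIC (hypothetical 10.0.0.1) loses to
# the more specific host route, so replies to the conductor use 20.26.34.1.
printf '%s\n' '0.0.0.0/0 10.0.0.1' '20.26.33.0/24 20.26.34.1' | next_hop 20.26.33.26
```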

Query the boot order via IPMI: ipmitool -P Huawei12#$ -H 50.1.65.245 -U root -p 623 -I lanplus chassis bootparam get 5
Set disk boot: ipmitool -P Huawei12#$ -H 50.1.65.245 -U root -p 623 -I lanplus chassis bootdev disk
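
To check the boot override from a script, the selector can be pulled out of the `chassis bootparam get 5` output. The sketch assumes the line format quoted earlier in this post ("Boot Device Selector : No override"); `boot_device_selector` is a hypothetical helper.

```shell
# Print the "Boot Device Selector" value from bootparam output read on stdin.
boot_device_selector() {
    sed -n 's/.*Boot Device Selector *: *//p'
}

# Sample line in the format observed above.
printf '%s\n' ' - Boot Device Selector : No override' | boot_device_selector
```

In practice the input would be the output of the query command above piped into the helper.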

- The user image was written to /dev/sdl instead of the first disk, and the whole boot process timed out

a. Yao Jun (姚军) modified the ramdisk to always use /dev/sda as the target disk
b. Set deploy_callback_timeout=900 in ironic.conf
Update 05/04:
Li Hao (李灏): ironic node-update 4fae2ae3-0935-4585-8be2-00298015f8f3 replace properties/root_device='{"name": "/dev/sda"}'

- The image was written to /dev/sda, but ironic-conductor did not reboot the machine, so the boot hung
View the logs inside IPA with journalctl -fu python-ironic-agent
journalctl --no-pager

- After the image is written to /dev/sda, IPA fails when running partprobe /dev/sda

The ironic-lib in the ramdisk needs this patch: https://review.openstack.org/#/c/444061/
- 8 B% S* t6 H6 E7 X9 g! t) V* U0 t
|
|