Posted to users@cloudstack.apache.org by Yiping Zhang <yi...@adobe.com.INVALID> on 2019/06/04 00:41:11 UTC

Can't start systemVM in a new advanced zone deployment

Hi, list:

I am struggling with deploying a new advanced zone using ACS 4.11.2.0 + vSphere 6.5 + NetApp volumes for primary and secondary storage devices. The initial setup of CS management server, seeding of systemVM template, and advanced zone deployment all went smoothly.

Once I enabled the zone in the web UI, the systemVM template was copied/staged onto the primary storage device. But subsequent VM creations from this template fail with these errors:


2019-06-03 18:38:15,764 INFO  [c.c.h.v.m.HostMO] (DirectAgent-7:ctx-d01169cb esx-0001-a-001.example.org, job-3/job-29, cmd: CopyCommand) VM 533b6fcf3fa6301aadcc2b168f3f999a not found in host cache

2019-06-03 18:38:17,017 INFO  [c.c.h.v.r.VmwareResource] (DirectAgent-4:ctx-08b54fbd esx-0001-a-001.example.org, job-3/job-29, cmd: CopyCommand) VmwareStorageProcessor and VmwareStorageSubsystemCommandHandler successfully reconfigured

2019-06-03 18:38:17,128 INFO  [c.c.s.r.VmwareStorageProcessor] (DirectAgent-4:ctx-08b54fbd esx-0001-a-001.example.org, job-3/job-29, cmd: CopyCommand) creating full clone from template

2019-06-03 18:38:17,657 INFO  [c.c.h.v.u.VmwareHelper] (DirectAgent-4:ctx-08b54fbd esx-0001-a-001.example.org, job-3/job-29, cmd: CopyCommand) [ignored]failed toi get message for exception: Error caused by file /vmfs/volumes/afc5e946-03bfe3c2/533b6fcf3fa6301aadcc2b168f3f999a/533b6fcf3fa6301aadcc2b168f3f999a-000001.vmdk

2019-06-03 18:38:17,658 ERROR [c.c.s.r.VmwareStorageProcessor] (DirectAgent-4:ctx-08b54fbd esx-0001-a-001.example.org, job-3/job-29, cmd: CopyCommand) clone volume from base image failed due to Exception: java.lang.RuntimeException

Message: Error caused by file /vmfs/volumes/afc5e946-03bfe3c2/533b6fcf3fa6301aadcc2b168f3f999a/533b6fcf3fa6301aadcc2b168f3f999a-000001.vmdk



If I try to create a “new VM from template” (533b6fcf3fa6301aadcc2b168f3f999a) manually in the vCenter UI, I receive exactly the same error message. The VMDK file named in the error message is a snapshot of the base disk image, but it is not part of the original template OVA on the secondary storage.  So, in the process of copying the template from secondary to primary storage, a snapshot got created and the disk became corrupted/unusable.
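
For what it's worth, the snapshot chain of the staged template can be inspected directly from an ESXi shell; something like the following (paths taken from the error above, and the vmkfstools check assumes that option is available on this ESXi build):

# list the staged template directory on the primary storage datastore
ls -l /vmfs/volumes/afc5e946-03bfe3c2/533b6fcf3fa6301aadcc2b168f3f999a/
# the delta descriptor; its parentFileNameHint should point at the base .vmdk
cat /vmfs/volumes/afc5e946-03bfe3c2/533b6fcf3fa6301aadcc2b168f3f999a/533b6fcf3fa6301aadcc2b168f3f999a-000001.vmdk
# optional consistency check of the disk chain
vmkfstools -e /vmfs/volumes/afc5e946-03bfe3c2/533b6fcf3fa6301aadcc2b168f3f999a/533b6fcf3fa6301aadcc2b168f3f999a-000001.vmdk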

Much later in the log file, there is another error message, “failed to fetch any free public IP address” (for the SSVM, I think).  I don’t know whether these two errors are related or whether one is the root cause of the other.

The full management server log is uploaded as https://pastebin.com/c05wiQ3R

Any help or insight on what went wrong here is much appreciated.

Thanks

Yiping

Re: Can't start systemVM in a new advanced zone deployment

Posted by Sergey Levitskiy <se...@hotmail.com>.
Yes, snapshots are supposed to be in the PS template copy.
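
For reference, the base snapshot that CloudStack creates on the primary-storage template copy is recorded in the template's .vmsd descriptor; a quick check, using the paths from this thread:

# the template's snapshot database; expect an entry like snapshot0.displayName = "cloud.template.base"
cat /vmfs/volumes/afc5e946-03bfe3c2/533b6fcf3fa6301aadcc2b168f3f999a/533b6fcf3fa6301aadcc2b168f3f999a.vmsd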

On 6/6/19, 9:24 AM, "Yiping Zhang" <yi...@adobe.com.INVALID> wrote:

    The NFS volume definitely allows root mount and has RW permissions, as we can already see the volume mounted and the template staged on primary storage. The volume is mounted as an NFSv3 datastore in vSphere.
    
    Volume snapshots are enabled; I can ask to have them disabled to see if it makes any difference.  I need to find out more about the NFS version and qtree mode from our storage admin.
    
    One thing I noticed is that when CloudStack templates are staged onto primary storage, a snapshot is created which does not exist in the original OVA or on secondary storage.  I suppose this is the expected behavior?
    
    Yiping
    
    On 6/6/19, 6:59 AM, "Sergey Levitskiy" <se...@hotmail.com> wrote:
    
        This option is 'vol options name_of_volume nosnapdir on'; however, if I recall correctly, it is supposed to work even with the .snapshot directory visible.
        Can you find out all the vol options on your NetApp volume? I would be most concerned about:
        - NFS version - NFS v4 should be disabled
        - security qtree mode to be set to UNIX
        - allow root mount
        
        I am also wondering whether ACS is able to create the ROOT-XX folder, so you might want to watch the contents of the datastore while ACS tries the operations.
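
        A quick way to gather those details (this assumes the 7-mode style CLI used in the 'vol options' example above; clustered ONTAP syntax differs), plus a check from the ESXi side:

        # on the NetApp: dump all options of the volume and the current export rules
        vol options name_of_volume
        exportfs
        # on an ESXi host: confirm the datastore is mounted as NFS v3
        esxcli storage nfs list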
         
        
        On 6/5/19, 11:43 PM, "Paul Angus" <pa...@shapeblue.com> wrote:
        
            Hi Yiping,
            
            do you have snapshots enabled on the NetApp filer?  (it used to be seen as a  ".snapshot"  subdirectory in each directory)
            
            If so try disabling snapshots - there used to be a bug where the .snapshot directory would confuse CloudStack.
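
            A quick check from an ESXi shell (datastore path taken from this thread; the .snapshot directory is only visible when the filer exposes snapshot directories on the export):

            ls -la /vmfs/volumes/afc5e946-03bfe3c2/.snapshot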
            
            paul.angus@shapeblue.com 
            www.shapeblue.com
            Amadeus House, Floral Street, London  WC2E 9DPUK
            @shapeblue
              
             
            
            
            -----Original Message-----
            From: Yiping Zhang <yi...@adobe.com.INVALID> 
            Sent: 05 June 2019 23:38
            To: users@cloudstack.apache.org
            Subject: Re: Can't start systemVM in a new advanced zone deployment
            
            Hi, Sergey:
            
            I found more logs in vpxa.log (the ESXi hosts use the UTC time zone, so I was looking at the wrong time periods earlier).  I have uploaded more logs to pastebin.
            
            From these log entries, it appears that when copying the template to a VM, it tried to open the destination VMDK file and got a file-not-found error.
            
            In the case where CloudStack attempted to create a systemVM, the destination VMDK file path it is looking for is "<datastore>/<disk-name>/<disk-name>.vmdk"; see the uploaded log at https://pastebin.com/aFysZkTy
            
            In the case where I manually created a new VM from a (different) template in the vCenter UI, the destination VMDK file path it is looking for is "<datastore>/<VM-NAME>/<VM-NAME>.vmdk"; see the uploaded log at https://pastebin.com/yHcsD8xB
            
            So, I am confused as to how the path for the destination VMDK was determined, whether by CloudStack or by VMware, and how I ended up with this.
            
            Yiping
            
            
            On 6/5/19, 12:32 PM, "Sergey Levitskiy" <se...@hotmail.com> wrote:
            
                Some operation logs get transferred to the vCenter log, vpxd.log. It is not straightforward to trace, but VMware will be able to help should you open a case with them.
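
                Typical places to grep, assuming a 6.5 VCSA and default ESXi log locations (adjust if logging is redirected):

                # on the vCenter Server Appliance
                grep 533b6fcf3fa6301aadcc2b168f3f999a /var/log/vmware/vpxd/vpxd.log
                # on the ESXi host that ran the clone
                grep 533b6fcf3fa6301aadcc2b168f3f999a /var/log/vpxa.log /var/log/vmkernel.log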
                
                
                On 6/5/19, 11:39 AM, "Yiping Zhang" <yi...@adobe.com.INVALID> wrote:
                
                    Hi, Sergey:
                    
                    During the time period when I had problems cloning the template, there were only a few unique entries in vmkernel.log, and they were repeated hundreds/thousands of times across all the CPU cores:
                    
                    2019-06-02T16:47:00.633Z cpu9:8491061)FSS: 6751: Failed to open file 'hpilo-d0ccb15'; Requested flags 0x5, world: 8491061 [ams-ahs], (Existing flags 0x5, world: 8491029 [ams-main]): Busy
                    2019-06-02T16:47:49.320Z cpu1:66415)nhpsa: hpsa_vmkScsiCmdDone:6384: Sense data: error code: 0x70, key: 0x5, info:00 00 00 00 , cmdInfo:00 00 00 00 , CmdSN: 0xd5c, worldId: 0x818e8e, Cmd: 0x85, ASC: 0x20, ASCQ: 0x0
                    2019-06-02T16:47:49.320Z cpu1:66415)ScsiDeviceIO: 2948: Cmd(0x43954115be40) 0x85, CmdSN 0xd5c from world 8490638 to dev "naa.600508b1001c6d77d7dd6a0cc0953df1" failed H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x20 0x0.
                    
                    The device "naa.600508b1001c6d77d7dd6a0cc0953df1" is the local disk on this host.
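
                    For completeness, the device identity can be confirmed with something like this (assuming the standard esxcli namespace on 6.5):

                    # shows vendor/model and the "Is Local" flag for that device
                    esxcli storage core device list -d naa.600508b1001c6d77d7dd6a0cc0953df1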
                    
                    Yiping
                    
                    
                    On 6/5/19, 11:15 AM, "Sergey Levitskiy" <se...@hotmail.com> wrote:
                    
                        This must be specific to that environment.  In full clone mode, ACS simply calls the CloneVM_Task vSphere API, so basically, until cloning of that template succeeds when attempted in the vSphere client, it will keep failing in ACS. Can you post vmkernel.log from your ESX host esx-0001-a-001?
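
                        One way to watch this is to tail the kernel log on that host while retrying the clone from the vSphere client (standard ESXi log paths assumed):

                        # follow the log during the clone attempt, then pull out entries mentioning the template
                        tail -f /var/log/vmkernel.log
                        grep 533b6fcf3fa6301aadcc2b168f3f999a /var/log/vmkernel.log /var/log/vpxa.log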
                        
                        
                        On 6/5/19, 8:47 AM, "Yiping Zhang" <yi...@adobe.com.INVALID> wrote:
                        
                            Well,  I can always reproduce it in this particular vSphere setup, but in a different ACS+vSphere environment, I don't see this problem.
                            
                            Yiping
                            
                            On 6/5/19, 1:00 AM, "Andrija Panic" <an...@gmail.com> wrote:
                            
                                Yiping,
                                
                                if you are sure you can reproduce the issue, it would be good to raise a
                                GitHub issue and provide as much detail as possible.
                                
                                Andrija
                                
                                On Wed, 5 Jun 2019 at 05:29, Yiping Zhang <yi...@adobe.com.invalid>
                                wrote:
                                
                                > Hi, Sergey:
                                >
                                > Thanks for the tip. After setting vmware.create.full.clone=false,  I was
                                > able to create and start system VM instances.    However,  I feel that the
                                > underlying problem still exists, and I am just working around it instead of
                                > fixing it,  because in my lab CloudStack instance with the same version of
                                > ACS and vSphere,  I still have vmware.create.full.clone=true and all is
                                > working as expected.
                                >
                                > I did some reading on VMware docs regarding full clone vs. linked clone.
                                > It seems that the best practice is to use full clone for production,
                                > especially if there are high rates of changes to the disks.  So
                                > eventually,  I need to understand and fix the root cause for this issue.
                                > At least for now,  I am over this hurdle and I can move on.
                                >
                                > Thanks again,
                                >
                                > Yiping
                                >
                                > On 6/4/19, 11:13 AM, "Sergey Levitskiy" <se...@hotmail.com> wrote:
                                >
                                >     Everything looks good and consistent including all references in VMDK
                                > and its snapshot. I would try these 2 routes:
                                >     1. Figure out what vSphere error actually means from vmkernel log of
                                > ESX when ACS tries to clone the template. If the same error happens while
                                > doing it outside of ACS then a support case with VMware can be an option
                                >     2. Try using link clones. This can be done by this global setting and
                                > restarting management server
                                >     vmware.create.full.clone                    false
                                >
                                >
                                >     On 6/4/19, 9:57 AM, "Yiping Zhang" <yi...@adobe.com.INVALID> wrote:
                                >
                                >         Hi, Sergey:
                                >
                                >         Thanks for the help. By now, I have dropped and recreated DB,
                                > re-deployed this zone multiple times, blow away primary and secondary
                                > storage (including all contents on them) , or just delete template itself
                                > from primary storage, multiple times.  Every time I ended up with the same
                                > error at the same place.
                                >
                                >         The full management server log,  from the point I seeded the
                                > systemvmtemplate for vmware, to deploying a new advanced zone and enable
                                > the zone to let CS to create system VM's and finally disable the zone to
                                > stop infinite loop of trying to recreate failed system VM's,  are posted
                                > at pastebin:
                                >
                                >
                                 > https://pastebin.com/c05wiQ3R
                                >
                                >         Here are the content of relevant files for the template on primary
                                > storage:
                                >
                                >         1) /vmfsvolumes:
                                >
                                >         ls -l /vmfs/volumes/
                                >         total 2052
                                >         drwxr-xr-x    1 root     root             8 Jan  1  1970
                                > 414f6a73-87cd6dac-9585-133ddd409762
                                >         lrwxr-xr-x    1 root     root            17 Jun  4 16:37
                                > 42054b8459633172be231d72a52d59d4 -> afc5e946-03bfe3c2          <== this is
                                > the NFS datastore for primary storage
                                >         drwxr-xr-x    1 root     root             8 Jan  1  1970
                                > 5cd4b46b-fa4fcff0-d2a1-00215a9b31c0
                                >         drwxr-xr-t    1 root     root          1400 Jun  3 22:50
                                > 5cd4b471-c2318b91-8fb2-00215a9b31c0
                                >         drwxr-xr-x    1 root     root             8 Jan  1  1970
                                > 5cd4b471-da49a95b-bdb6-00215a9b31c0
                                >         drwxr-xr-x    4 root     root          4096 Jun  3 23:38
                                > afc5e946-03bfe3c2
                                >         drwxr-xr-x    1 root     root             8 Jan  1  1970
                                > b70c377c-54a9d28a-6a7b-3f462a475f73
                                >
                                >         2) content in template dir on primary storage:
                                >
                                >         ls -l
                                > /vmfs/volumes/42054b8459633172be231d72a52d59d4/533b6fcf3fa6301aadcc2b168f3f999a/
                                >         total 1154596
                                >         -rw-------    1 root     root          8192 Jun  3 23:38
                                > 533b6fcf3fa6301aadcc2b168f3f999a-000001-delta.vmdk
                                >         -rw-------    1 root     root           366 Jun  3 23:38
                                > 533b6fcf3fa6301aadcc2b168f3f999a-000001.vmdk
                                >         -rw-r--r--    1 root     root           268 Jun  3 23:38
                                > 533b6fcf3fa6301aadcc2b168f3f999a-7d5d73de.hlog
                                >         -rw-------    1 root     root          9711 Jun  3 23:38
                                > 533b6fcf3fa6301aadcc2b168f3f999a-Snapshot1.vmsn
                                >         -rw-------    1 root     root     2097152000 Jun  3 23:38
                                > 533b6fcf3fa6301aadcc2b168f3f999a-flat.vmdk
                                >         -rw-------    1 root     root           518 Jun  3 23:38
                                > 533b6fcf3fa6301aadcc2b168f3f999a.vmdk
                                >         -rw-r--r--    1 root     root           471 Jun  3 23:38
                                > 533b6fcf3fa6301aadcc2b168f3f999a.vmsd
                                >         -rwxr-xr-x    1 root     root          1402 Jun  3 23:38
                                > 533b6fcf3fa6301aadcc2b168f3f999a.vmtx
                                >
                                >         3) *.vmdk file content:
                                >
                                >         cat
                                > /vmfs/volumes/42054b8459633172be231d72a52d59d4/533b6fcf3fa6301aadcc2b168f3f999a/533b6fcf3fa6301aadcc2b168f3f999a.vmdk
                                >         # Disk DescriptorFile
                                >         version=1
                                >         encoding="UTF-8"
                                >         CID=ecb01275
                                >         parentCID=ffffffff
                                >         isNativeSnapshot="no"
                                >         createType="vmfs"
                                >
                                >         # Extent description
                                >         RW 4096000 VMFS "533b6fcf3fa6301aadcc2b168f3f999a-flat.vmdk"
                                >
                                >         # The Disk Data Base
                                >         #DDB
                                >
                                >         ddb.adapterType = "lsilogic"
                                >         ddb.geometry.cylinders = "4063"
                                >         ddb.geometry.heads = "16"
                                >         ddb.geometry.sectors = "63"
                                >         ddb.longContentID = "1c60ba48999abde959998f05ecb01275"
                                >         ddb.thinProvisioned = "1"
                                >         ddb.uuid = "60 00 C2 9b 52 6d 98 c4-1f 44 51 ce 1e 70 a9 70"
                                >         ddb.virtualHWVersion = "13"
                                >
                                >         4) *-0001.vmdk content:
                                >
                                >         cat
                                > /vmfs/volumes/42054b8459633172be231d72a52d59d4/533b6fcf3fa6301aadcc2b168f3f999a/533b6fcf3fa6301aadcc2b168f3f999a-000001.vmdk
                                >
                                >         # Disk DescriptorFile
                                >         version=1
                                >         encoding="UTF-8"
                                >         CID=ecb01275
                                >         parentCID=ecb01275
                                >         isNativeSnapshot="no"
                                >         createType="vmfsSparse"
                                >         parentFileNameHint="533b6fcf3fa6301aadcc2b168f3f999a.vmdk"
                                >         # Extent description
                                >         RW 4096000 VMFSSPARSE
                                > "533b6fcf3fa6301aadcc2b168f3f999a-000001-delta.vmdk"
                                >
                                >         # The Disk Data Base
                                >         #DDB
                                >
                                >         ddb.longContentID = "1c60ba48999abde959998f05ecb01275"
                                >
                                >
                                >         5) *.vmtx content:
                                >
                                >         cat
                                > /vmfs/volumes/42054b8459633172be231d72a52d59d4/533b6fcf3fa6301aadcc2b168f3f999a/533b6fcf3fa6301aadcc2b168f3f999a.vmtx
                                >
                                >         .encoding = "UTF-8"
                                >         config.version = "8"
                                >         virtualHW.version = "8"
                                >         nvram = "533b6fcf3fa6301aadcc2b168f3f999a.nvram"
                                >         pciBridge0.present = "TRUE"
                                >         svga.present = "TRUE"
                                >         pciBridge4.present = "TRUE"
                                >         pciBridge4.virtualDev = "pcieRootPort"
                                >         pciBridge4.functions = "8"
                                >         pciBridge5.present = "TRUE"
                                >         pciBridge5.virtualDev = "pcieRootPort"
                                >         pciBridge5.functions = "8"
                                >         pciBridge6.present = "TRUE"
                                >         pciBridge6.virtualDev = "pcieRootPort"
                                >         pciBridge6.functions = "8"
                                >         pciBridge7.present = "TRUE"
                                >         pciBridge7.virtualDev = "pcieRootPort"
                                >         pciBridge7.functions = "8"
                                >         vmci0.present = "TRUE"
                                >         hpet0.present = "TRUE"
                                >         floppy0.present = "FALSE"
                                >         memSize = "256"
                                >         scsi0.virtualDev = "lsilogic"
                                >         scsi0.present = "TRUE"
                                >         ide0:0.startConnected = "FALSE"
                                >         ide0:0.deviceType = "atapi-cdrom"
                                >         ide0:0.fileName = "CD/DVD drive 0"
                                >         ide0:0.present = "TRUE"
                                >         scsi0:0.deviceType = "scsi-hardDisk"
                                >         scsi0:0.fileName = "533b6fcf3fa6301aadcc2b168f3f999a-000001.vmdk"
                                >         scsi0:0.present = "TRUE"
                                >         displayName = "533b6fcf3fa6301aadcc2b168f3f999a"
                                >         annotation = "systemvmtemplate-4.11.2.0-vmware"
                                >         guestOS = "otherlinux-64"
                                >         toolScripts.afterPowerOn = "TRUE"
                                >         toolScripts.afterResume = "TRUE"
                                >         toolScripts.beforeSuspend = "TRUE"
                                >         toolScripts.beforePowerOff = "TRUE"
                                >         uuid.bios = "42 02 f1 40 33 e8 de e5-1a c5 93 2a c9 12 47 61"
                                >         vc.uuid = "50 02 5b d9 e9 c9 77 86-28 3e 84 00 22 2b eb d3"
                                >         firmware = "bios"
                                >         migrate.hostLog = "533b6fcf3fa6301aadcc2b168f3f999a-7d5d73de.hlog"
                                >
                                >
                                >         6) *.vmsd file content:
                                >
                                >         cat
                                > /vmfs/volumes/42054b8459633172be231d72a52d59d4/533b6fcf3fa6301aadcc2b168f3f999a/533b6fcf3fa6301aadcc2b168f3f999a.vmsd
                                >         .encoding = "UTF-8"
                                >         snapshot.lastUID = "1"
                                >         snapshot.current = "1"
                                >         snapshot0.uid = "1"
                                >         snapshot0.filename =
                                > "533b6fcf3fa6301aadcc2b168f3f999a-Snapshot1.vmsn"
                                >         snapshot0.displayName = "cloud.template.base"
                                >         snapshot0.description = "Base snapshot"
                                >         snapshot0.createTimeHigh = "363123"
                                >         snapshot0.createTimeLow = "-679076964"
                                >         snapshot0.numDisks = "1"
                                >         snapshot0.disk0.fileName = "533b6fcf3fa6301aadcc2b168f3f999a.vmdk"
                                >         snapshot0.disk0.node = "scsi0:0"
                                >         snapshot.numSnapshots = "1"
                                >
                                >         7) *-Snapshot1.vmsn content:
                                >
                                >         cat
                                > /vmfs/volumes/42054b8459633172be231d72a52d59d4/533b6fcf3fa6301aadcc2b168f3f999a/533b6fcf3fa6301aadcc2b168f3f999a-Snapshot1.vmsn
                                >
                                >         ҾSnapshot\?%?cfgFilet%t%.encoding = "UTF-8"
                                >         config.version = "8"
                                >         virtualHW.version = "8"
                                >         nvram = "533b6fcf3fa6301aadcc2b168f3f999a.nvram"
                                >         pciBridge0.present = "TRUE"
                                >         svga.present = "TRUE"
                                >         pciBridge4.present = "TRUE"
                                >         pciBridge4.virtualDev = "pcieRootPort"
                                >         pciBridge4.functions = "8"
                                >         pciBridge5.present = "TRUE"
                                >         pciBridge5.virtualDev = "pcieRootPort"
                                >         pciBridge5.functions = "8"
                                >         pciBridge6.present = "TRUE"
                                >         pciBridge6.virtualDev = "pcieRootPort"
                                >         pciBridge6.functions = "8"
                                >         pciBridge7.present = "TRUE"
                                >         pciBridge7.virtualDev = "pcieRootPort"
                                >         pciBridge7.functions = "8"
                                >         vmci0.present = "TRUE"
                                >         hpet0.present = "TRUE"
                                >         floppy0.present = "FALSE"
                                >         memSize = "256"
                                >         scsi0.virtualDev = "lsilogic"
                                >         scsi0.present = "TRUE"
                                >         ide0:0.startConnected = "FALSE"
                                >         ide0:0.deviceType = "atapi-cdrom"
                                >         ide0:0.fileName = "CD/DVD drive 0"
                                >         ide0:0.present = "TRUE"
                                >         scsi0:0.deviceType = "scsi-hardDisk"
                                >         scsi0:0.fileName = "533b6fcf3fa6301aadcc2b168f3f999a.vmdk"
                                >         scsi0:0.present = "TRUE"
                                >         displayName = "533b6fcf3fa6301aadcc2b168f3f999a"
                                >         annotation = "systemvmtemplate-4.11.2.0-vmware"
                                >         guestOS = "otherlinux-64"
                                >         toolScripts.afterPowerOn = "TRUE"
                                >         toolScripts.afterResume = "TRUE"
                                >         toolScripts.beforeSuspend = "TRUE"
                                >         toolScripts.beforePowerOff = "TRUE"
                                >         uuid.bios = "42 02 f1 40 33 e8 de e5-1a c5 93 2a c9 12 47 61"
                                >         vc.uuid = "50 02 5b d9 e9 c9 77 86-28 3e 84 00 22 2b eb d3"
                                >         firmware = "bios"
                                >         migrate.hostLog = "533b6fcf3fa6301aadcc2b168f3f999a-7d5d73de.hlog"
                                >
                                >
                                >         ------------
                                >
                                >         That's all the data on the template VMDK.
                                >
                                >         Much appreciate your time!
                                >
                                >         Yiping
                                >
                                >
                                >
                                >         On 6/4/19, 9:29 AM, "Sergey Levitskiy" <se...@hotmail.com>
                                > wrote:
                                >
                                 >             Have you tried deleting the template from PS and letting ACS recopy
                                 > it again? If the issue is reproducible we can try to look at what is wrong
                                > with VMDK. Please post content of 533b6fcf3fa6301aadcc2b168f3f999a.vmdk ,
                                > 533b6fcf3fa6301aadcc2b168f3f999a-000001.vmdk and
                                 > 533b6fcf3fa6301aadcc2b168f3f999a.vmx (their equivalent after ACS finishes
                                > copying template). Also from one of your ESX hosts output of this
                                >             ls -al /vmfs/volumes
                                >             ls -al /vmfs/volumes/*/533b6fcf3fa6301aadcc2b168f3f999a (their
                                 > equivalent after ACS finishes copying template)
                                >
                                >              Can you also post management server log starting from the
                                > point you unregister and delete template from the vCenter.
                                >
                                >             On 6/4/19, 8:37 AM, "Yiping Zhang" <yi...@adobe.com.INVALID>
                                > wrote:
                                >
                                >                 I have manually imported the OVA to vCenter and
                                > successfully cloned a VM instance with it, on the same NFS datastore.
                                >
                                >
                                >                 On 6/4/19, 8:25 AM, "Sergey Levitskiy" <
                                > serg38l@hotmail.com> wrote:
                                >
                                >                     I would suspect the template is corrupted on the
                                > secondary storage. You can try disabling/enabling link clone feature and
                                > see if it works the other way.
                                >                     vmware.create.full.clone                    false
                                >
                                >                     Also systemVM template might have been generated on a
                                > newer version of vSphere and not compatible with ESXi 6.5. What you can do
                                > to validate this is to manually deploy OVA that is in Secondary storage and
                                > try to spin up VM from it directly in vCenter.
                                >
                                >
                                >
                                >                     On 6/3/19, 5:41 PM, "Yiping Zhang"
                                > <yi...@adobe.com.INVALID> wrote:
                                >
                                >                         Hi, list:
                                >
                                >                         I am struggling with deploying a new advanced zone
                                > using ACS 4.11.2.0 + vSphere 6.5 + NetApp volumes for primary and secondary
                                > storage devices. The initial setup of CS management server, seeding of
                                > systemVM template, and advanced zone deployment all went smoothly.
                                >
                                >                         Once I enabled the zone in web UI and the systemVM
                                > template gets copied/staged on to primary storage device. But subsequent VM
                                > creations from this template would fail with errors:
                                >
                                >
                                >                         2019-06-03 18:38:15,764 INFO  [c.c.h.v.m.HostMO]
                                > (DirectAgent-7:ctx-d01169cb esx-0001-a-001.example.org, job-3/job-29,
                                > cmd: CopyCommand) VM 533b6fcf3fa6301aadcc2b168f3f999a not found in host
                                > cache
                                >
                                >                         2019-06-03 18:38:17,017 INFO
                                > [c.c.h.v.r.VmwareResource] (DirectAgent-4:ctx-08b54fbd
                                > esx-0001-a-001.example.org, job-3/job-29, cmd: CopyCommand)
                                > VmwareStorageProcessor and VmwareStorageSubsystemCommandHandler
                                > successfully reconfigured
                                >
                                >                         2019-06-03 18:38:17,128 INFO
                                > [c.c.s.r.VmwareStorageProcessor] (DirectAgent-4:ctx-08b54fbd
                                > esx-0001-a-001.example.org, job-3/job-29, cmd: CopyCommand) creating full
                                > clone from template
                                >
                                >                         2019-06-03 18:38:17,657 INFO
                                > [c.c.h.v.u.VmwareHelper] (DirectAgent-4:ctx-08b54fbd
                                > esx-0001-a-001.example.org, job-3/job-29, cmd: CopyCommand)
                                > [ignored]failed toi get message for exception: Error caused by file
                                > /vmfs/volumes/afc5e946-03bfe3c2/533b6fcf3fa6301aadcc2b168f3f999a/533b6fcf3fa6301aadcc2b168f3f999a-000001.vmdk
                                >
                                >                         2019-06-03 18:38:17,658 ERROR
                                > [c.c.s.r.VmwareStorageProcessor] (DirectAgent-4:ctx-08b54fbd
                                > esx-0001-a-001.example.org, job-3/job-29, cmd: CopyCommand) clone volume
                                > from base image failed due to Exception: java.lang.RuntimeException
                                >
                                >                         Message: Error caused by file
                                > /vmfs/volumes/afc5e946-03bfe3c2/533b6fcf3fa6301aadcc2b168f3f999a/533b6fcf3fa6301aadcc2b168f3f999a-000001.vmdk
                                >
                                >
                                >
                                >                         If I try to create “new VM from template”
                                > (533b6fcf3fa6301aadcc2b168f3f999a) on vCenter UI manually,  I will receive
                                > exactly the same error message. The name of the VMDK file in the error
                                > message is a snapshot of the base disk image, but it is not part of the
                                > original template OVA on the secondary storage.  So, in the process of
                                > copying the template from secondary to primary storage, a snapshot got
                                > created and the disk became corrupted/unusable.
                                >
                                >                         Much later in the log file,  there is another
                                > error message “failed to fetch any free public IP address” (for ssvm, I
                                > think).  I don’t know if these two errors are related or if one is the root
                                > cause for the other error.
                                >
                                >                         The full management server log is uploaded as
                                 > https://pastebin.com/c05wiQ3R
                                >
                                >                         Any help or insight on what went wrong here are
                                > much appreciated.
                                >
                                >                         Thanks
                                >
                                >                         Yiping
                                >
                                >
                                >
                                >
                                >
                                >
                                >
                                >
                                >
                                >
                                >
                                >
                                >
                                
                                -- 
                                
                                Andrija Panić
                                
                            
                            
                        
                        
                    
                    
                
                
            
            
        
        
    
    


Re: Can't start systemVM in a new advanced zone deployment

Posted by Yiping Zhang <yi...@adobe.com.INVALID>.
The NFS volume definitely allows root mount and has RW permissions, as we can already see the volume mounted and the template staged on primary storage. The volume is mounted as an NFSv3 datastore in vSphere.

Volume snapshots are enabled; I can ask to have them disabled to see if it makes any difference.  I need to find out more about the NFS version and qtree mode from our storage admin.

One thing I noticed is that when CloudStack templates are staged onto primary storage, a snapshot is created which does not exist in the original OVA or on secondary storage.  I suppose this is the expected behavior?

Yiping

On 6/6/19, 6:59 AM, "Sergey Levitskiy" <se...@hotmail.com> wrote:

    This option is 'vol options name_of_volume nosnapdir on'; however, if I recall correctly, it is supposed to work even with the .snapshot directory visible.
    Can you find out all the vol options on your NetApp volume? I would be most concerned about:
    - NFS version - NFS v4 should be disabled
    - security qtree mode to be set to UNIX
    - allow root mount
    
    I am also wondering whether ACS is able to create the ROOT-XX folder, so you might want to watch the contents of the datastore while ACS tries the operations.
     
    
    On 6/5/19, 11:43 PM, "Paul Angus" <pa...@shapeblue.com> wrote:
    
        Hi Yiping,
        
        do you have snapshots enabled on the NetApp filer?  (it used to be seen as a  ".snapshot"  subdirectory in each directory)
        
        If so try disabling snapshots - there used to be a bug where the .snapshot directory would confuse CloudStack.
        
        paul.angus@shapeblue.com 
        www.shapeblue.com
        Amadeus House, Floral Street, London  WC2E 9DPUK
        @shapeblue
          
         
        
        
        -----Original Message-----
        From: Yiping Zhang <yi...@adobe.com.INVALID> 
        Sent: 05 June 2019 23:38
        To: users@cloudstack.apache.org
        Subject: Re: Can't start systemVM in a new advanced zone deployment
        
        Hi, Sergey:
        
        I found more logs in vpxa.log (the ESXi hosts use the UTC time zone, so I was looking at the wrong time periods earlier).  I have uploaded more logs to pastebin.
        
        From these log entries, it appears that when copying the template to a VM, it tried to open the destination VMDK file and got a file-not-found error.
        
        In the case where CloudStack attempted to create a systemVM, the destination VMDK file path it is looking for is "<datastore>/<disk-name>/<disk-name>.vmdk"; see the uploaded log at https://pastebin.com/aFysZkTy
        
        In the case where I manually created a new VM from a (different) template in the vCenter UI, the destination VMDK file path it is looking for is "<datastore>/<VM-NAME>/<VM-NAME>.vmdk"; see the uploaded log at https://pastebin.com/yHcsD8xB
        
        So, I am confused as to how the path for the destination VMDK was determined, whether by CloudStack or by VMware, and how I ended up with this.
        
        Yiping
        
        
        On 6/5/19, 12:32 PM, "Sergey Levitskiy" <se...@hotmail.com> wrote:
        
            Some operation logs get transferred to the vCenter log, vpxd.log. It is not straightforward to trace, but VMware will be able to help should you open a case with them.
            
            
            On 6/5/19, 11:39 AM, "Yiping Zhang" <yi...@adobe.com.INVALID> wrote:
            
                Hi, Sergey:
                
                During the time period when I had problems cloning the template, there were only a few unique entries in vmkernel.log, and they were repeated hundreds/thousands of times across all the CPU cores:
                
                2019-06-02T16:47:00.633Z cpu9:8491061)FSS: 6751: Failed to open file 'hpilo-d0ccb15'; Requested flags 0x5, world: 8491061 [ams-ahs], (Existing flags 0x5, world: 8491029 [ams-main]): Busy
                2019-06-02T16:47:49.320Z cpu1:66415)nhpsa: hpsa_vmkScsiCmdDone:6384: Sense data: error code: 0x70, key: 0x5, info:00 00 00 00 , cmdInfo:00 00 00 00 , CmdSN: 0xd5c, worldId: 0x818e8e, Cmd: 0x85, ASC: 0x20, ASCQ: 0x0
                2019-06-02T16:47:49.320Z cpu1:66415)ScsiDeviceIO: 2948: Cmd(0x43954115be40) 0x85, CmdSN 0xd5c from world 8490638 to dev "naa.600508b1001c6d77d7dd6a0cc0953df1" failed H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x20 0x0.
                
                The device "naa.600508b1001c6d77d7dd6a0cc0953df1" is the local disk on this host.
                
                Yiping
                
                
                On 6/5/19, 11:15 AM, "Sergey Levitskiy" <se...@hotmail.com> wrote:
                
                    This must be specific to that environment.  In full clone mode, ACS simply calls the CloneVM_Task vSphere API, so basically, until cloning of that template succeeds when attempted in the vSphere client, it will keep failing in ACS. Can you post vmkernel.log from your ESX host esx-0001-a-001?
                    
                    
                    On 6/5/19, 8:47 AM, "Yiping Zhang" <yi...@adobe.com.INVALID> wrote:
                    
                        Well,  I can always reproduce it in this particular vSphere setup, but in a different ACS+vSphere environment, I don't see this problem.
                        
                        Yiping
                        
                        On 6/5/19, 1:00 AM, "Andrija Panic" <an...@gmail.com> wrote:
                        
                            Yiping,
                            
                            if you are sure you can reproduce the issue, it would be good to raise a
                            GitHub issue and provide as much detail as possible.
                            
                            Andrija
                            
                            On Wed, 5 Jun 2019 at 05:29, Yiping Zhang <yi...@adobe.com.invalid>
                            wrote:
                            
                            > Hi, Sergey:
                            >
                            > Thanks for the tip. After setting vmware.create.full.clone=false,  I was
                            > able to create and start system VM instances.    However,  I feel that the
                            > underlying problem still exists, and I am just working around it instead of
                            > fixing it,  because in my lab CloudStack instance with the same version of
                            > ACS and vSphere,  I still have vmware.create.full.clone=true and all is
                            > working as expected.
                            >
                            > I did some reading on VMware docs regarding full clone vs. linked clone.
                            > It seems that the best practice is to use full clone for production,
                            > especially if there are high rates of changes to the disks.  So
                            > eventually,  I need to understand and fix the root cause for this issue.
                            > At least for now,  I am over this hurdle and I can move on.
                            >
                            > Thanks again,
                            >
                            > Yiping
                            >
                            > On 6/4/19, 11:13 AM, "Sergey Levitskiy" <se...@hotmail.com> wrote:
                            >
                            >     Everything looks good and consistent including all references in VMDK
                            > and its snapshot. I would try these 2 routes:
                            >     1. Figure out what vSphere error actually means from vmkernel log of
                            > ESX when ACS tries to clone the template. If the same error happens while
                            > doing it outside of ACS then a support case with VMware can be an option
                            >     2. Try using link clones. This can be done by this global setting and
                            > restarting management server
                            >     vmware.create.full.clone                    false
                            >
                            >
                            >     On 6/4/19, 9:57 AM, "Yiping Zhang" <yi...@adobe.com.INVALID> wrote:
                            >
                            >         Hi, Sergey:
                            >
                            >         Thanks for the help. By now, I have dropped and recreated DB,
                            > re-deployed this zone multiple times, blow away primary and secondary
                            > storage (including all contents on them) , or just delete template itself
                            > from primary storage, multiple times.  Every time I ended up with the same
                            > error at the same place.
                            >
                            >         The full management server log,  from the point I seeded the
                            > systemvmtemplate for vmware, to deploying a new advanced zone and enable
                            > the zone to let CS to create system VM's and finally disable the zone to
                            > stop infinite loop of trying to recreate failed system VM's,  are posted
                            > at pastebin:
                            >
                            >
                             > https://pastebin.com/c05wiQ3R
                            >
                            >         Here are the content of relevant files for the template on primary
                            > storage:
                            >
                            >         1) /vmfsvolumes:
                            >
                            >         ls -l /vmfs/volumes/
                            >         total 2052
                            >         drwxr-xr-x    1 root     root             8 Jan  1  1970
                            > 414f6a73-87cd6dac-9585-133ddd409762
                            >         lrwxr-xr-x    1 root     root            17 Jun  4 16:37
                            > 42054b8459633172be231d72a52d59d4 -> afc5e946-03bfe3c2          <== this is
                            > the NFS datastore for primary storage
                            >         drwxr-xr-x    1 root     root             8 Jan  1  1970
                            > 5cd4b46b-fa4fcff0-d2a1-00215a9b31c0
                            >         drwxr-xr-t    1 root     root          1400 Jun  3 22:50
                            > 5cd4b471-c2318b91-8fb2-00215a9b31c0
                            >         drwxr-xr-x    1 root     root             8 Jan  1  1970
                            > 5cd4b471-da49a95b-bdb6-00215a9b31c0
                            >         drwxr-xr-x    4 root     root          4096 Jun  3 23:38
                            > afc5e946-03bfe3c2
                            >         drwxr-xr-x    1 root     root             8 Jan  1  1970
                            > b70c377c-54a9d28a-6a7b-3f462a475f73
                            >
                            >         2) content in template dir on primary storage:
                            >
                            >         ls -l
                            > /vmfs/volumes/42054b8459633172be231d72a52d59d4/533b6fcf3fa6301aadcc2b168f3f999a/
                            >         total 1154596
                            >         -rw-------    1 root     root          8192 Jun  3 23:38
                            > 533b6fcf3fa6301aadcc2b168f3f999a-000001-delta.vmdk
                            >         -rw-------    1 root     root           366 Jun  3 23:38
                            > 533b6fcf3fa6301aadcc2b168f3f999a-000001.vmdk
                            >         -rw-r--r--    1 root     root           268 Jun  3 23:38
                            > 533b6fcf3fa6301aadcc2b168f3f999a-7d5d73de.hlog
                            >         -rw-------    1 root     root          9711 Jun  3 23:38
                            > 533b6fcf3fa6301aadcc2b168f3f999a-Snapshot1.vmsn
                            >         -rw-------    1 root     root     2097152000 Jun  3 23:38
                            > 533b6fcf3fa6301aadcc2b168f3f999a-flat.vmdk
                            >         -rw-------    1 root     root           518 Jun  3 23:38
                            > 533b6fcf3fa6301aadcc2b168f3f999a.vmdk
                            >         -rw-r--r--    1 root     root           471 Jun  3 23:38
                            > 533b6fcf3fa6301aadcc2b168f3f999a.vmsd
                            >         -rwxr-xr-x    1 root     root          1402 Jun  3 23:38
                            > 533b6fcf3fa6301aadcc2b168f3f999a.vmtx
                            >
                            >         3) *.vmdk file content:
                            >
                            >         cat
                            > /vmfs/volumes/42054b8459633172be231d72a52d59d4/533b6fcf3fa6301aadcc2b168f3f999a/533b6fcf3fa6301aadcc2b168f3f999a.vmdk
                            >         # Disk DescriptorFile
                            >         version=1
                            >         encoding="UTF-8"
                            >         CID=ecb01275
                            >         parentCID=ffffffff
                            >         isNativeSnapshot="no"
                            >         createType="vmfs"
                            >
                            >         # Extent description
                            >         RW 4096000 VMFS "533b6fcf3fa6301aadcc2b168f3f999a-flat.vmdk"
                            >
                            >         # The Disk Data Base
                            >         #DDB
                            >
                            >         ddb.adapterType = "lsilogic"
                            >         ddb.geometry.cylinders = "4063"
                            >         ddb.geometry.heads = "16"
                            >         ddb.geometry.sectors = "63"
                            >         ddb.longContentID = "1c60ba48999abde959998f05ecb01275"
                            >         ddb.thinProvisioned = "1"
                            >         ddb.uuid = "60 00 C2 9b 52 6d 98 c4-1f 44 51 ce 1e 70 a9 70"
                            >         ddb.virtualHWVersion = "13"
                            >
                            >         4) *-0001.vmdk content:
                            >
                            >         cat
                            > /vmfs/volumes/42054b8459633172be231d72a52d59d4/533b6fcf3fa6301aadcc2b168f3f999a/533b6fcf3fa6301aadcc2b168f3f999a-000001.vmdk
                            >
                            >         # Disk DescriptorFile
                            >         version=1
                            >         encoding="UTF-8"
                            >         CID=ecb01275
                            >         parentCID=ecb01275
                            >         isNativeSnapshot="no"
                            >         createType="vmfsSparse"
                            >         parentFileNameHint="533b6fcf3fa6301aadcc2b168f3f999a.vmdk"
                            >         # Extent description
                            >         RW 4096000 VMFSSPARSE
                            > "533b6fcf3fa6301aadcc2b168f3f999a-000001-delta.vmdk"
                            >
                            >         # The Disk Data Base
                            >         #DDB
                            >
                            >         ddb.longContentID = "1c60ba48999abde959998f05ecb01275"
                            >
                            >
                            >         5) *.vmtx content:
                            >
                            >         cat
                            > /vmfs/volumes/42054b8459633172be231d72a52d59d4/533b6fcf3fa6301aadcc2b168f3f999a/533b6fcf3fa6301aadcc2b168f3f999a.vmtx
                            >
                            >         .encoding = "UTF-8"
                            >         config.version = "8"
                            >         virtualHW.version = "8"
                            >         nvram = "533b6fcf3fa6301aadcc2b168f3f999a.nvram"
                            >         pciBridge0.present = "TRUE"
                            >         svga.present = "TRUE"
                            >         pciBridge4.present = "TRUE"
                            >         pciBridge4.virtualDev = "pcieRootPort"
                            >         pciBridge4.functions = "8"
                            >         pciBridge5.present = "TRUE"
                            >         pciBridge5.virtualDev = "pcieRootPort"
                            >         pciBridge5.functions = "8"
                            >         pciBridge6.present = "TRUE"
                            >         pciBridge6.virtualDev = "pcieRootPort"
                            >         pciBridge6.functions = "8"
                            >         pciBridge7.present = "TRUE"
                            >         pciBridge7.virtualDev = "pcieRootPort"
                            >         pciBridge7.functions = "8"
                            >         vmci0.present = "TRUE"
                            >         hpet0.present = "TRUE"
                            >         floppy0.present = "FALSE"
                            >         memSize = "256"
                            >         scsi0.virtualDev = "lsilogic"
                            >         scsi0.present = "TRUE"
                            >         ide0:0.startConnected = "FALSE"
                            >         ide0:0.deviceType = "atapi-cdrom"
                            >         ide0:0.fileName = "CD/DVD drive 0"
                            >         ide0:0.present = "TRUE"
                            >         scsi0:0.deviceType = "scsi-hardDisk"
                            >         scsi0:0.fileName = "533b6fcf3fa6301aadcc2b168f3f999a-000001.vmdk"
                            >         scsi0:0.present = "TRUE"
                            >         displayName = "533b6fcf3fa6301aadcc2b168f3f999a"
                            >         annotation = "systemvmtemplate-4.11.2.0-vmware"
                            >         guestOS = "otherlinux-64"
                            >         toolScripts.afterPowerOn = "TRUE"
                            >         toolScripts.afterResume = "TRUE"
                            >         toolScripts.beforeSuspend = "TRUE"
                            >         toolScripts.beforePowerOff = "TRUE"
                            >         uuid.bios = "42 02 f1 40 33 e8 de e5-1a c5 93 2a c9 12 47 61"
                            >         vc.uuid = "50 02 5b d9 e9 c9 77 86-28 3e 84 00 22 2b eb d3"
                            >         firmware = "bios"
                            >         migrate.hostLog = "533b6fcf3fa6301aadcc2b168f3f999a-7d5d73de.hlog"
                            >
                            >
                            >         6) *.vmsd file content:
                            >
                            >         cat
                            > /vmfs/volumes/42054b8459633172be231d72a52d59d4/533b6fcf3fa6301aadcc2b168f3f999a/533b6fcf3fa6301aadcc2b168f3f999a.vmsd
                            >         .encoding = "UTF-8"
                            >         snapshot.lastUID = "1"
                            >         snapshot.current = "1"
                            >         snapshot0.uid = "1"
                            >         snapshot0.filename =
                            > "533b6fcf3fa6301aadcc2b168f3f999a-Snapshot1.vmsn"
                            >         snapshot0.displayName = "cloud.template.base"
                            >         snapshot0.description = "Base snapshot"
                            >         snapshot0.createTimeHigh = "363123"
                            >         snapshot0.createTimeLow = "-679076964"
                            >         snapshot0.numDisks = "1"
                            >         snapshot0.disk0.fileName = "533b6fcf3fa6301aadcc2b168f3f999a.vmdk"
                            >         snapshot0.disk0.node = "scsi0:0"
                            >         snapshot.numSnapshots = "1"
                            >
                            >         7) *-Snapshot1.vmsn content:
                            >
                            >         cat
                            > /vmfs/volumes/42054b8459633172be231d72a52d59d4/533b6fcf3fa6301aadcc2b168f3f999a/533b6fcf3fa6301aadcc2b168f3f999a-Snapshot1.vmsn
                            >
                            >         ҾSnapshot\?%?cfgFilet%t%.encoding = "UTF-8"
                            >         config.version = "8"
                            >         virtualHW.version = "8"
                            >         nvram = "533b6fcf3fa6301aadcc2b168f3f999a.nvram"
                            >         pciBridge0.present = "TRUE"
                            >         svga.present = "TRUE"
                            >         pciBridge4.present = "TRUE"
                            >         pciBridge4.virtualDev = "pcieRootPort"
                            >         pciBridge4.functions = "8"
                            >         pciBridge5.present = "TRUE"
                            >         pciBridge5.virtualDev = "pcieRootPort"
                            >         pciBridge5.functions = "8"
                            >         pciBridge6.present = "TRUE"
                            >         pciBridge6.virtualDev = "pcieRootPort"
                            >         pciBridge6.functions = "8"
                            >         pciBridge7.present = "TRUE"
                            >         pciBridge7.virtualDev = "pcieRootPort"
                            >         pciBridge7.functions = "8"
                            >         vmci0.present = "TRUE"
                            >         hpet0.present = "TRUE"
                            >         floppy0.present = "FALSE"
                            >         memSize = "256"
                            >         scsi0.virtualDev = "lsilogic"
                            >         scsi0.present = "TRUE"
                            >         ide0:0.startConnected = "FALSE"
                            >         ide0:0.deviceType = "atapi-cdrom"
                            >         ide0:0.fileName = "CD/DVD drive 0"
                            >         ide0:0.present = "TRUE"
                            >         scsi0:0.deviceType = "scsi-hardDisk"
                            >         scsi0:0.fileName = "533b6fcf3fa6301aadcc2b168f3f999a.vmdk"
                            >         scsi0:0.present = "TRUE"
                            >         displayName = "533b6fcf3fa6301aadcc2b168f3f999a"
                            >         annotation = "systemvmtemplate-4.11.2.0-vmware"
                            >         guestOS = "otherlinux-64"
                            >         toolScripts.afterPowerOn = "TRUE"
                            >         toolScripts.afterResume = "TRUE"
                            >         toolScripts.beforeSuspend = "TRUE"
                            >         toolScripts.beforePowerOff = "TRUE"
                            >         uuid.bios = "42 02 f1 40 33 e8 de e5-1a c5 93 2a c9 12 47 61"
                            >         vc.uuid = "50 02 5b d9 e9 c9 77 86-28 3e 84 00 22 2b eb d3"
                            >         firmware = "bios"
                            >         migrate.hostLog = "533b6fcf3fa6301aadcc2b168f3f999a-7d5d73de.hlog"
                            >
                            >
                            >         ------------
                            >
                            >         That's all the data on the template VMDK.
                            >
                            >         Much appreciate your time!
                            >
                            >         Yiping
                            >
                            >
                            >
                            >         On 6/4/19, 9:29 AM, "Sergey Levitskiy" <se...@hotmail.com>
                            > wrote:
                            >
                             >             Have you tried deleting the template from PS and letting ACS recopy
                             > it again? If the issue is reproducible we can try to look at what is wrong
                             > with the VMDK. Please post the content of 533b6fcf3fa6301aadcc2b168f3f999a.vmdk,
                             > 533b6fcf3fa6301aadcc2b168f3f999a-000001.vmdk and
                             > 533b6fcf3fa6301aadcc2b168f3f999a.vmx (their equivalent after ACS finishes
                             > copying the template). Also, from one of your ESX hosts, the output of this:
                             >             ls -al /vmfs/volumes
                             >             ls -al /vmfs/volumes/*/533b6fcf3fa6301aadcc2b168f3f999a (their
                             > equivalent after ACS finishes copying the template)
                            >
                            >              Can you also post management server log starting from the
                            > point you unregister and delete template from the vCenter.
                            >
                            >             On 6/4/19, 8:37 AM, "Yiping Zhang" <yi...@adobe.com.INVALID>
                            > wrote:
                            >
                            >                 I have manually imported the OVA to vCenter and
                            > successfully cloned a VM instance with it, on the same NFS datastore.
                            >
                            >
                            >                 On 6/4/19, 8:25 AM, "Sergey Levitskiy" <
                            > serg38l@hotmail.com> wrote:
                            >
                            >                     I would suspect the template is corrupted on the
                            > secondary storage. You can try disabling/enabling link clone feature and
                            > see if it works the other way.
                            >                     vmware.create.full.clone                    false
                            >
                            >                     Also systemVM template might have been generated on a
                            > newer version of vSphere and not compatible with ESXi 6.5. What you can do
                            > to validate this is to manually deploy OVA that is in Secondary storage and
                            > try to spin up VM from it directly in vCenter.
                            >
                            >
                            >

Re: Can't start systemVM in a new advanced zone deployment

Posted by Sergey Levitskiy <se...@hotmail.com>.
This option is 'vol options name_of_volume nosnapdir on'; however, if I recall it right, it is supposed to work even with the .snapshot directory visible.
Can you find out all the vol options on your NetApp volume? I would be most concerned about:
- NFS version - NFS v4 should be disabled
- security qtree mode to be set to UNIX
- allow root mount

I am also wondering whether ACS is able to create the ROOT-XX folder, so you might want to watch the contents of the datastore while ACS attempts the operations.
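
For reference, on a 7-mode filer the checks would look roughly like the following (illustrative only; substitute your volume name, and note that clustered ONTAP uses different syntax):

    vol options <volume_name>                  (list all current volume options)
    vol options <volume_name> nosnapdir on     (hide the .snapshot directory)
    qtree security /vol/<volume_name> unix     (security style should be unix)
    options nfs.v4.enable                      (should be off)
    exportfs                                   (the export should grant root access to the ESXi hosts)

And to watch whether ACS manages to create the ROOT-XX folder, something like this from one of the ESXi hosts while it retries:

    while true; do ls -al /vmfs/volumes/afc5e946-03bfe3c2/ ; sleep 5 ; done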
 

On 6/5/19, 11:43 PM, "Paul Angus" <pa...@shapeblue.com> wrote:

    Hi Yiping,
    
    do you have snapshots enabled on the NetApp filer?  (it used to be seen as a  ".snapshot"  subdirectory in each directory)
    
    If so try disabling snapshots - there used to be a bug where the .snapshot directory would confuse CloudStack.
    


RE: Can't start systemVM in a new advanced zone deployment

Posted by Paul Angus <pa...@shapeblue.com>.
Hi Yiping,

do you have snapshots enabled on the NetApp filer?  (it used to be seen as a  ".snapshot"  subdirectory in each directory)

If so try disabling snapshots - there used to be a bug where the .snapshot directory would confuse CloudStack.
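
A quick way to check from one of the ESXi hosts is to look for the hidden directory directly on the datastore mount (mount point taken from earlier in this thread; illustrative):

    ls -al /vmfs/volumes/afc5e946-03bfe3c2/.snapshot
    ls -al /vmfs/volumes/afc5e946-03bfe3c2/533b6fcf3fa6301aadcc2b168f3f999a/.snapshot

If either listing returns anything, the filer is exposing the snapshot directory inside the datastore.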

paul.angus@shapeblue.com 
www.shapeblue.com
Amadeus House, Floral Street, London  WC2E 9DPUK
@shapeblue
  
 


-----Original Message-----
From: Yiping Zhang <yi...@adobe.com.INVALID> 
Sent: 05 June 2019 23:38
To: users@cloudstack.apache.org
Subject: Re: Can't start systemVM in a new advanced zone deployment

Hi, Sergey:

I found more logs in vpxa.log (the ESXi hosts are using the UTC time zone, so I was looking at the wrong time periods earlier). I have uploaded more logs to pastebin.

From these log entries, it appears that when copying the template to a VM, it tried to open the destination VMDK file and got a file-not-found error.

In the case where CloudStack attempted to create a systemVM, the destination VMDK file path it looks for is "<datastore>/<disk-name>/<disk-name>.vmdk"; see the uploaded log at https://pastebin.com/aFysZkTy

In the case where I manually created a new VM from a (different) template in the vCenter UI, the destination VMDK file path it looks for is "<datastore>/<VM-NAME>/<VM-NAME>.vmdk"; see the uploaded log at https://pastebin.com/yHcsD8xB

So I am confused as to how the path for the destination VMDK was determined, whether by CloudStack or by VMware, and how I ended up with this.
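
If it helps to trace this further, the relevant entries can be grepped straight out of the host logs (standard ESXi log locations; the long hex name is the template name from the earlier CopyCommand errors):

    grep ROOT- /var/log/vpxa.log
    grep 533b6fcf3fa6301aadcc2b168f3f999a /var/log/vpxa.log /var/log/hostd.log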

Yiping


On 6/5/19, 12:32 PM, "Sergey Levitskiy" <se...@hotmail.com> wrote:

    Some operation logs get transferred to the vCenter log vpxd.log. It is not straightforward to trace, but VMware will be able to help should you open a case with them.
    
    
    On 6/5/19, 11:39 AM, "Yiping Zhang" <yi...@adobe.com.INVALID> wrote:
    
        Hi, Sergey:
        
        During the time period when I had the problem cloning the template, there are only a few unique entries in vmkernel.log, and they were repeated hundreds/thousands of times by all the CPU cores:
        
        2019-06-02T16:47:00.633Z cpu9:8491061)FSS: 6751: Failed to open file 'hpilo-d0ccb15'; Requested flags 0x5, world: 8491061 [ams-ahs], (Existing flags 0x5, world: 8491029 [ams-main]): Busy
        2019-06-02T16:47:49.320Z cpu1:66415)nhpsa: hpsa_vmkScsiCmdDone:6384: Sense data: error code: 0x70, key: 0x5, info:00 00 00 00 , cmdInfo:00 00 00 00 , CmdSN: 0xd5c, worldId: 0x818e8e, Cmd: 0x85, ASC: 0x20, ASCQ: 0x0
        2019-06-02T16:47:49.320Z cpu1:66415)ScsiDeviceIO: 2948: Cmd(0x43954115be40) 0x85, CmdSN 0xd5c from world 8490638 to dev "naa.600508b1001c6d77d7dd6a0cc0953df1" failed H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x20 0x0.
        
        The device " naa.600508b1001c6d77d7dd6a0cc0953df1" is the local disk on this host.
        
        Yiping
        
        
        On 6/5/19, 11:15 AM, "Sergey Levitskiy" <se...@hotmail.com> wrote:
        
            This must be specific to that environment. In full clone mode, ACS simply calls the CloneVM_Task vSphere API, so basically until cloning of that template succeeds when attempted in the vSphere client, it will keep failing in ACS. Can you post vmkernel.log from your ESX host esx-0001-a-001?
            
            
            On 6/5/19, 8:47 AM, "Yiping Zhang" <yi...@adobe.com.INVALID> wrote:
            
                Well, I can always reproduce it in this particular vSphere setup, but in a different ACS + vSphere environment I don't see this problem.
                
                Yiping
                
                On 6/5/19, 1:00 AM, "Andrija Panic" <an...@gmail.com> wrote:
                
                    Yiping,
                    
                    if you are sure you can reproduce the issue, it would be good to raise a
                    GitHub issue and provide as much detail as possible.
                    
                    Andrija
                    
                    On Wed, 5 Jun 2019 at 05:29, Yiping Zhang <yi...@adobe.com.invalid>
                    wrote:
                    
                    > Hi, Sergey:
                    >
                    > Thanks for the tip. After setting vmware.create.full.clone=false,  I was
                    > able to create and start system VM instances.    However,  I feel that the
                    > underlying problem still exists, and I am just working around it instead of
                    > fixing it,  because in my lab CloudStack instance with the same version of
                    > ACS and vSphere,  I still have vmware.create.full.clone=true and all is
                    > working as expected.
                    >
                    > I did some reading on VMware docs regarding full clone vs. linked clone.
                    > It seems that the best practice is to use full clone for production,
                    > especially if there are high rates of changes to the disks.  So
                    > eventually,  I need to understand and fix the root cause for this issue.
                    > At least for now,  I am over this hurdle and I can move on.
                    >
                    > Thanks again,
                    >
                    > Yiping
                    >
                    > On 6/4/19, 11:13 AM, "Sergey Levitskiy" <se...@hotmail.com> wrote:
                    >
                    >     Everything looks good and consistent including all references in VMDK
                    > and its snapshot. I would try these 2 routes:
                    >     1. Figure out what vSphere error actually means from vmkernel log of
                    > ESX when ACS tries to clone the template. If the same error happens while
                    > doing it outside of ACS then a support case with VMware can be an option
                    >     2. Try using link clones. This can be done by this global setting and
                    > restarting management server
                    >     vmware.create.full.clone                    false
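                    >
                    >     For example (illustrative; a management server restart is still needed
                    > afterwards), via cloudmonkey:
                    >
                    >         update configuration name=vmware.create.full.clone value=false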
                    >
                    >
                    >     On 6/4/19, 9:57 AM, "Yiping Zhang" <yi...@adobe.com.INVALID> wrote:
                    >
                    >         Hi, Sergey:
                    >
                    >         Thanks for the help. By now, I have dropped and recreated the DB,
                    > re-deployed this zone multiple times, blown away primary and secondary
                    > storage (including all contents on them), or just deleted the template itself
                    > from primary storage, multiple times. Every time I ended up with the same
                    > error at the same place.
                    >
                    >         The full management server log, from the point I seeded the
                    > systemvmtemplate for VMware, through deploying a new advanced zone, enabling
                    > the zone to let CS create the system VMs, and finally disabling the zone to
                    > stop the infinite loop of trying to recreate the failed system VMs, is posted
                    > at pastebin:
                    >
                    >
                    > https://pastebin.com/c05wiQ3R
                    >
                    >         Here are the contents of the relevant files for the template on primary
                    > storage:
                    >
                    >         1) /vmfs/volumes:
                    >
                    >         ls -l /vmfs/volumes/
                    >         total 2052
                    >         drwxr-xr-x    1 root     root             8 Jan  1  1970
                    > 414f6a73-87cd6dac-9585-133ddd409762
                    >         lrwxr-xr-x    1 root     root            17 Jun  4 16:37
                    > 42054b8459633172be231d72a52d59d4 -> afc5e946-03bfe3c2          <== this is
                    > the NFS datastore for primary storage
                    >         drwxr-xr-x    1 root     root             8 Jan  1  1970
                    > 5cd4b46b-fa4fcff0-d2a1-00215a9b31c0
                    >         drwxr-xr-t    1 root     root          1400 Jun  3 22:50
                    > 5cd4b471-c2318b91-8fb2-00215a9b31c0
                    >         drwxr-xr-x    1 root     root             8 Jan  1  1970
                    > 5cd4b471-da49a95b-bdb6-00215a9b31c0
                    >         drwxr-xr-x    4 root     root          4096 Jun  3 23:38
                    > afc5e946-03bfe3c2
                    >         drwxr-xr-x    1 root     root             8 Jan  1  1970
                    > b70c377c-54a9d28a-6a7b-3f462a475f73
                    >
                    >         2) content in template dir on primary storage:
                    >
                    >         ls -l
                    > /vmfs/volumes/42054b8459633172be231d72a52d59d4/533b6fcf3fa6301aadcc2b168f3f999a/
                    >         total 1154596
                    >         -rw-------    1 root     root          8192 Jun  3 23:38
                    > 533b6fcf3fa6301aadcc2b168f3f999a-000001-delta.vmdk
                    >         -rw-------    1 root     root           366 Jun  3 23:38
                    > 533b6fcf3fa6301aadcc2b168f3f999a-000001.vmdk
                    >         -rw-r--r--    1 root     root           268 Jun  3 23:38
                    > 533b6fcf3fa6301aadcc2b168f3f999a-7d5d73de.hlog
                    >         -rw-------    1 root     root          9711 Jun  3 23:38
                    > 533b6fcf3fa6301aadcc2b168f3f999a-Snapshot1.vmsn
                    >         -rw-------    1 root     root     2097152000 Jun  3 23:38
                    > 533b6fcf3fa6301aadcc2b168f3f999a-flat.vmdk
                    >         -rw-------    1 root     root           518 Jun  3 23:38
                    > 533b6fcf3fa6301aadcc2b168f3f999a.vmdk
                    >         -rw-r--r--    1 root     root           471 Jun  3 23:38
                    > 533b6fcf3fa6301aadcc2b168f3f999a.vmsd
                    >         -rwxr-xr-x    1 root     root          1402 Jun  3 23:38
                    > 533b6fcf3fa6301aadcc2b168f3f999a.vmtx
                    >
                    >         3) *.vmdk file content:
                    >
                    >         cat
                    > /vmfs/volumes/42054b8459633172be231d72a52d59d4/533b6fcf3fa6301aadcc2b168f3f999a/533b6fcf3fa6301aadcc2b168f3f999a.vmdk
                    >         # Disk DescriptorFile
                    >         version=1
                    >         encoding="UTF-8"
                    >         CID=ecb01275
                    >         parentCID=ffffffff
                    >         isNativeSnapshot="no"
                    >         createType="vmfs"
                    >
                    >         # Extent description
                    >         RW 4096000 VMFS "533b6fcf3fa6301aadcc2b168f3f999a-flat.vmdk"
                    >
                    >         # The Disk Data Base
                    >         #DDB
                    >
                    >         ddb.adapterType = "lsilogic"
                    >         ddb.geometry.cylinders = "4063"
                    >         ddb.geometry.heads = "16"
                    >         ddb.geometry.sectors = "63"
                    >         ddb.longContentID = "1c60ba48999abde959998f05ecb01275"
                    >         ddb.thinProvisioned = "1"
                    >         ddb.uuid = "60 00 C2 9b 52 6d 98 c4-1f 44 51 ce 1e 70 a9 70"
                    >         ddb.virtualHWVersion = "13"
                    >
                    >         4) *-0001.vmdk content:
                    >
                    >         cat
                    > /vmfs/volumes/42054b8459633172be231d72a52d59d4/533b6fcf3fa6301aadcc2b168f3f999a/533b6fcf3fa6301aadcc2b168f3f999a-000001.vmdk
                    >
                    >         # Disk DescriptorFile
                    >         version=1
                    >         encoding="UTF-8"
                    >         CID=ecb01275
                    >         parentCID=ecb01275
                    >         isNativeSnapshot="no"
                    >         createType="vmfsSparse"
                    >         parentFileNameHint="533b6fcf3fa6301aadcc2b168f3f999a.vmdk"
                    >         # Extent description
                    >         RW 4096000 VMFSSPARSE
                    > "533b6fcf3fa6301aadcc2b168f3f999a-000001-delta.vmdk"
                    >
                    >         # The Disk Data Base
                    >         #DDB
                    >
                    >         ddb.longContentID = "1c60ba48999abde959998f05ecb01275"
                    >
                    >
                    >         5) *.vmtx content:
                    >
                    >         cat
                    > /vmfs/volumes/42054b8459633172be231d72a52d59d4/533b6fcf3fa6301aadcc2b168f3f999a/533b6fcf3fa6301aadcc2b168f3f999a.vmtx
                    >
                    >         .encoding = "UTF-8"
                    >         config.version = "8"
                    >         virtualHW.version = "8"
                    >         nvram = "533b6fcf3fa6301aadcc2b168f3f999a.nvram"
                    >         pciBridge0.present = "TRUE"
                    >         svga.present = "TRUE"
                    >         pciBridge4.present = "TRUE"
                    >         pciBridge4.virtualDev = "pcieRootPort"
                    >         pciBridge4.functions = "8"
                    >         pciBridge5.present = "TRUE"
                    >         pciBridge5.virtualDev = "pcieRootPort"
                    >         pciBridge5.functions = "8"
                    >         pciBridge6.present = "TRUE"
                    >         pciBridge6.virtualDev = "pcieRootPort"
                    >         pciBridge6.functions = "8"
                    >         pciBridge7.present = "TRUE"
                    >         pciBridge7.virtualDev = "pcieRootPort"
                    >         pciBridge7.functions = "8"
                    >         vmci0.present = "TRUE"
                    >         hpet0.present = "TRUE"
                    >         floppy0.present = "FALSE"
                    >         memSize = "256"
                    >         scsi0.virtualDev = "lsilogic"
                    >         scsi0.present = "TRUE"
                    >         ide0:0.startConnected = "FALSE"
                    >         ide0:0.deviceType = "atapi-cdrom"
                    >         ide0:0.fileName = "CD/DVD drive 0"
                    >         ide0:0.present = "TRUE"
                    >         scsi0:0.deviceType = "scsi-hardDisk"
                    >         scsi0:0.fileName = "533b6fcf3fa6301aadcc2b168f3f999a-000001.vmdk"
                    >         scsi0:0.present = "TRUE"
                    >         displayName = "533b6fcf3fa6301aadcc2b168f3f999a"
                    >         annotation = "systemvmtemplate-4.11.2.0-vmware"
                    >         guestOS = "otherlinux-64"
                    >         toolScripts.afterPowerOn = "TRUE"
                    >         toolScripts.afterResume = "TRUE"
                    >         toolScripts.beforeSuspend = "TRUE"
                    >         toolScripts.beforePowerOff = "TRUE"
                    >         uuid.bios = "42 02 f1 40 33 e8 de e5-1a c5 93 2a c9 12 47 61"
                    >         vc.uuid = "50 02 5b d9 e9 c9 77 86-28 3e 84 00 22 2b eb d3"
                    >         firmware = "bios"
                    >         migrate.hostLog = "533b6fcf3fa6301aadcc2b168f3f999a-7d5d73de.hlog"
                    >
                    >
                    >         6) *.vmsd file content:
                    >
                    >         cat
                    > /vmfs/volumes/42054b8459633172be231d72a52d59d4/533b6fcf3fa6301aadcc2b168f3f999a/533b6fcf3fa6301aadcc2b168f3f999a.vmsd
                    >         .encoding = "UTF-8"
                    >         snapshot.lastUID = "1"
                    >         snapshot.current = "1"
                    >         snapshot0.uid = "1"
                    >         snapshot0.filename =
                    > "533b6fcf3fa6301aadcc2b168f3f999a-Snapshot1.vmsn"
                    >         snapshot0.displayName = "cloud.template.base"
                    >         snapshot0.description = "Base snapshot"
                    >         snapshot0.createTimeHigh = "363123"
                    >         snapshot0.createTimeLow = "-679076964"
                    >         snapshot0.numDisks = "1"
                    >         snapshot0.disk0.fileName = "533b6fcf3fa6301aadcc2b168f3f999a.vmdk"
                    >         snapshot0.disk0.node = "scsi0:0"
                    >         snapshot.numSnapshots = "1"
                    >
                    >         7) *-Snapshot1.vmsn content:
                    >
                    >         cat
                    > /vmfs/volumes/42054b8459633172be231d72a52d59d4/533b6fcf3fa6301aadcc2b168f3f999a/533b6fcf3fa6301aadcc2b168f3f999a-Snapshot1.vmsn
                    >
                    >         ҾSnapshot\?%?cfgFilet%t%.encoding = "UTF-8"
                    >         config.version = "8"
                    >         virtualHW.version = "8"
                    >         nvram = "533b6fcf3fa6301aadcc2b168f3f999a.nvram"
                    >         pciBridge0.present = "TRUE"
                    >         svga.present = "TRUE"
                    >         pciBridge4.present = "TRUE"
                    >         pciBridge4.virtualDev = "pcieRootPort"
                    >         pciBridge4.functions = "8"
                    >         pciBridge5.present = "TRUE"
                    >         pciBridge5.virtualDev = "pcieRootPort"
                    >         pciBridge5.functions = "8"
                    >         pciBridge6.present = "TRUE"
                    >         pciBridge6.virtualDev = "pcieRootPort"
                    >         pciBridge6.functions = "8"
                    >         pciBridge7.present = "TRUE"
                    >         pciBridge7.virtualDev = "pcieRootPort"
                    >         pciBridge7.functions = "8"
                    >         vmci0.present = "TRUE"
                    >         hpet0.present = "TRUE"
                    >         floppy0.present = "FALSE"
                    >         memSize = "256"
                    >         scsi0.virtualDev = "lsilogic"
                    >         scsi0.present = "TRUE"
                    >         ide0:0.startConnected = "FALSE"
                    >         ide0:0.deviceType = "atapi-cdrom"
                    >         ide0:0.fileName = "CD/DVD drive 0"
                    >         ide0:0.present = "TRUE"
                    >         scsi0:0.deviceType = "scsi-hardDisk"
                    >         scsi0:0.fileName = "533b6fcf3fa6301aadcc2b168f3f999a.vmdk"
                    >         scsi0:0.present = "TRUE"
                    >         displayName = "533b6fcf3fa6301aadcc2b168f3f999a"
                    >         annotation = "systemvmtemplate-4.11.2.0-vmware"
                    >         guestOS = "otherlinux-64"
                    >         toolScripts.afterPowerOn = "TRUE"
                    >         toolScripts.afterResume = "TRUE"
                    >         toolScripts.beforeSuspend = "TRUE"
                    >         toolScripts.beforePowerOff = "TRUE"
                    >         uuid.bios = "42 02 f1 40 33 e8 de e5-1a c5 93 2a c9 12 47 61"
                    >         vc.uuid = "50 02 5b d9 e9 c9 77 86-28 3e 84 00 22 2b eb d3"
                    >         firmware = "bios"
                    >         migrate.hostLog = "533b6fcf3fa6301aadcc2b168f3f999a-7d5d73de.hlog"
                    >
                    >
                    >         ------------
                    >
                    >         That's all the data on the template VMDK.
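
                    An additional check that can be run in the ESXi shell against the files listed
                    above is vmkfstools' chain-consistency test (a sketch; it assumes the
                    -e/--chainConsistent option of vmkfstools on ESXi 6.x and the datastore path
                    from the listing):

                        cd /vmfs/volumes/42054b8459633172be231d72a52d59d4/533b6fcf3fa6301aadcc2b168f3f999a
                        # verify the snapshot delta resolves cleanly against its parent descriptor
                        vmkfstools -e 533b6fcf3fa6301aadcc2b168f3f999a-000001.vmdk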
                    >
                    >         Much appreciate your time!
                    >
                    >         Yiping
                    >
                    >
                    >
                    >         On 6/4/19, 9:29 AM, "Sergey Levitskiy" <se...@hotmail.com>
                    > wrote:
                    >
                    >             Have you tried deleting the template from PS and letting ACS
                    > recopy it again? If the issue is reproducible we can try to look at what is
                    > wrong with the VMDK. Please post the content of 533b6fcf3fa6301aadcc2b168f3f999a.vmdk,
                    > 533b6fcf3fa6301aadcc2b168f3f999a-000001.vmdk and
                    > 533b6fcf3fa6301aadcc2b168f3f999a.vmx (their equivalents after ACS finishes
                    > copying the template). Also, from one of your ESX hosts, the output of these:
                    >             ls -al /vmfs/volumes
                    >             ls -al /vmfs/volumes/*/533b6fcf3fa6301aadcc2b168f3f999a (their
                    > equivalent after ACS finishes copying the template)
                    >
                    >              Can you also post management server log starting from the
                    > point you unregister and delete template from the vCenter.
                    >
                    >             On 6/4/19, 8:37 AM, "Yiping Zhang" <yi...@adobe.com.INVALID>
                    > wrote:
                    >
                    >                 I have manually imported the OVA to vCenter and
                    > successfully cloned a VM instance with it, on the same NFS datastore.
                    >
                    >
                    >                 On 6/4/19, 8:25 AM, "Sergey Levitskiy" <
                    > serg38l@hotmail.com> wrote:
                    >
                    >                     I would suspect the template is corrupted on the
                    > secondary storage. You can try disabling/enabling link clone feature and
                    > see if it works the other way.
                    >                     vmware.create.full.clone                    false
                    >
                    >                     Also systemVM template might have been generated on a
                    > newer version of vSphere and not compatible with ESXi 6.5. What you can do
                    > to validate this is to manually deploy OVA that is in Secondary storage and
                    > try to spin up VM from it directly in vCenter.
                    >
                    >
                    >
                    >                     On 6/3/19, 5:41 PM, "Yiping Zhang"
                    > <yi...@adobe.com.INVALID> wrote:
                    >
                    >                         Hi, list:
                    >
                    >                         I am struggling with deploying a new advanced zone
                    > using ACS 4.11.2.0 + vSphere 6.5 + NetApp volumes for primary and secondary
                    > storage devices. The initial setup of CS management server, seeding of
                    > systemVM template, and advanced zone deployment all went smoothly.
                    >
                    >                         Once I enabled the zone in web UI and the systemVM
                    > template gets copied/staged on to primary storage device. But subsequent VM
                    > creations from this template would fail with errors:
                    >
                    >
                    >                         2019-06-03 18:38:15,764 INFO  [c.c.h.v.m.HostMO]
                    > (DirectAgent-7:ctx-d01169cb esx-0001-a-001.example.org, job-3/job-29,
                    > cmd: CopyCommand) VM 533b6fcf3fa6301aadcc2b168f3f999a not found in host
                    > cache
                    >
                    >                         2019-06-03 18:38:17,017 INFO
                    > [c.c.h.v.r.VmwareResource] (DirectAgent-4:ctx-08b54fbd
                    > esx-0001-a-001.example.org, job-3/job-29, cmd: CopyCommand)
                    > VmwareStorageProcessor and VmwareStorageSubsystemCommandHandler
                    > successfully reconfigured
                    >
                    >                         2019-06-03 18:38:17,128 INFO
                    > [c.c.s.r.VmwareStorageProcessor] (DirectAgent-4:ctx-08b54fbd
                    > esx-0001-a-001.example.org, job-3/job-29, cmd: CopyCommand) creating full
                    > clone from template
                    >
                    >                         2019-06-03 18:38:17,657 INFO
                    > [c.c.h.v.u.VmwareHelper] (DirectAgent-4:ctx-08b54fbd
                    > esx-0001-a-001.example.org, job-3/job-29, cmd: CopyCommand)
                    > [ignored]failed toi get message for exception: Error caused by file
                    > /vmfs/volumes/afc5e946-03bfe3c2/533b6fcf3fa6301aadcc2b168f3f999a/533b6fcf3fa6301aadcc2b168f3f999a-000001.vmdk
                    >
                    >                         2019-06-03 18:38:17,658 ERROR
                    > [c.c.s.r.VmwareStorageProcessor] (DirectAgent-4:ctx-08b54fbd
                    > esx-0001-a-001.example.org, job-3/job-29, cmd: CopyCommand) clone volume
                    > from base image failed due to Exception: java.lang.RuntimeException
                    >
                    >                         Message: Error caused by file
                    > /vmfs/volumes/afc5e946-03bfe3c2/533b6fcf3fa6301aadcc2b168f3f999a/533b6fcf3fa6301aadcc2b168f3f999a-000001.vmdk
                    >
                    >
                    >
                    >                         If I try to create “new VM from template”
                    > (533b6fcf3fa6301aadcc2b168f3f999a) on vCenter UI manually,  I will receive
                    > exactly the same error message. The name of the VMDK file in the error
                    > message is a snapshot of the base disk image, but it is not part of the
                    > original template OVA on the secondary storage.  So, in the process of
                    > copying the template from secondary to primary storage, a snapshot got
                    > created and the disk became corrupted/unusable.
                    >
                    >                         Much later in the log file,  there is another
                    > error message “failed to fetch any free public IP address” (for ssvm, I
                    > think).  I don’t know if these two errors are related or if one is the root
                    > cause for the other error.
                    >
                    >                         The full management server log is uploaded as
                    > https://pastebin.com/c05wiQ3R
                    >
                    >                         Any help or insight on what went wrong here are
                    > much appreciated.
                    >
                    >                         Thanks
                    >
                    >                         Yiping
                    >
                    >
                    >
                    >
                    >
                    >
                    >
                    >
                    >
                    >
                    >
                    >
                    >
                    
                    -- 
                    
                    Andrija Panić
                    
                
                
            
            
        
        
    
    


Re: Can't start systemVM in a new advanced zone deployment

Posted by Yiping Zhang <yi...@adobe.com.INVALID>.
Hi, Sergey:

I found more logs in vpxa.log (the ESXi hosts are using the UTC time zone, so I was looking at the wrong time periods earlier). I have uploaded more logs to pastebin.

From these log entries, it appears that when copying the template to a VM, it tried to open the destination VMDK file and got a "file not found" error.

In the case where CloudStack attempted to create a systemVM, the destination VMDK file path it looks for is "<datastore>/<disk-name>/<disk-name>.vmdk"; see the uploaded log at https://pastebin.com/aFysZkTy

In the case where I manually created a new VM from a (different) template in the vCenter UI, the destination VMDK file path it looks for is "<datastore>/<VM-NAME>/<VM-NAME>.vmdk"; see the uploaded log at https://pastebin.com/yHcsD8xB

So I am confused as to how the path for the destination VMDK is determined, and whether it is CloudStack or VMware that decides it. How did I end up with this?
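
One way to compare the two destination paths outside of both UIs is to reproduce the clone with govc (a sketch; it assumes the govmomi CLI is installed, that GOVC_URL/GOVC_USERNAME/GOVC_PASSWORD point at this vCenter, and that the datastore name in vCenter matches the 42054b84... symlink shown earlier; test-clone-01 is just a throwaway name):

    # clone the registered template to a powered-off throwaway VM on the same NFS datastore
    govc vm.clone -vm 533b6fcf3fa6301aadcc2b168f3f999a -ds 42054b8459633172be231d72a52d59d4 -on=false test-clone-01

    # then list what actually landed in the destination folder on that datastore
    govc datastore.ls -ds 42054b8459633172be231d72a52d59d4 test-clone-01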

Yiping


On 6/5/19, 12:32 PM, "Sergey Levitskiy" <se...@hotmail.com> wrote:

    Some operations' logs get transferred to the vCenter log, vpxd.log. It is not straightforward to trace, but VMware will be able to help should you open a case with them.
    
    
    On 6/5/19, 11:39 AM, "Yiping Zhang" <yi...@adobe.com.INVALID> wrote:
    
        Hi, Sergey:
        
        During the time period when I had the problem cloning the template, there were only a few unique entries in vmkernel.log, and they were repeated hundreds or thousands of times by all the CPU cores:
        
        2019-06-02T16:47:00.633Z cpu9:8491061)FSS: 6751: Failed to open file 'hpilo-d0ccb15'; Requested flags 0x5, world: 8491061 [ams-ahs], (Existing flags 0x5, world: 8491029 [ams-main]): Busy
        2019-06-02T16:47:49.320Z cpu1:66415)nhpsa: hpsa_vmkScsiCmdDone:6384: Sense data: error code: 0x70, key: 0x5, info:00 00 00 00 , cmdInfo:00 00 00 00 , CmdSN: 0xd5c, worldId: 0x818e8e, Cmd: 0x85, ASC: 0x20, ASCQ: 0x0
        2019-06-02T16:47:49.320Z cpu1:66415)ScsiDeviceIO: 2948: Cmd(0x43954115be40) 0x85, CmdSN 0xd5c from world 8490638 to dev "naa.600508b1001c6d77d7dd6a0cc0953df1" failed H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x20 0x0.
        
        The device " naa.600508b1001c6d77d7dd6a0cc0953df1" is the local disk on this host.
        
        Yiping
        
        
        On 6/5/19, 11:15 AM, "Sergey Levitskiy" <se...@hotmail.com> wrote:
        
            This must be specific to that environment. In full clone mode, ACS simply calls cloneVMTask of the vSphere API, so until cloning of that template succeeds when attempted in the vSphere client, it will keep failing in ACS. Can you post vmkernel.log from your ESX host esx-0001-a-001?
            
            
            On 6/5/19, 8:47 AM, "Yiping Zhang" <yi...@adobe.com.INVALID> wrote:
            
                Well, I can always reproduce it in this particular vSphere setup, but in a different ACS+vSphere environment I don't see this problem.
                
                Yiping
                
                On 6/5/19, 1:00 AM, "Andrija Panic" <an...@gmail.com> wrote:
                
                    Yiping,
                    
                    if you are sure you can reproduce the issue, it would be good to raise a
                    GitHub issue and provide as much detail as possible.
                    
                    Andrija
                    
                    On Wed, 5 Jun 2019 at 05:29, Yiping Zhang <yi...@adobe.com.invalid>
                    wrote:
                    
                    > Hi, Sergey:
                    >
                    > Thanks for the tip. After setting vmware.create.full.clone=false,  I was
                    > able to create and start system VM instances.    However,  I feel that the
                    > underlying problem still exists, and I am just working around it instead of
                    > fixing it,  because in my lab CloudStack instance with the same version of
                    > ACS and vSphere,  I still have vmware.create.full.clone=true and all is
                    > working as expected.
                    >
                    > I did some reading on VMware docs regarding full clone vs. linked clone.
                    > It seems that the best practice is to use full clone for production,
                    > especially if there are high rates of changes to the disks.  So
                    > eventually,  I need to understand and fix the root cause for this issue.
                    > At least for now,  I am over this hurdle and I can move on.
                    >
                    > Thanks again,
                    >
                    > Yiping
                    >
                    > On 6/4/19, 11:13 AM, "Sergey Levitskiy" <se...@hotmail.com> wrote:
                    >
                    >     Everything looks good and consistent including all references in VMDK
                    > and its snapshot. I would try these 2 routes:
                    >     1. Figure out what vSphere error actually means from vmkernel log of
                    > ESX when ACS tries to clone the template. If the same error happens while
                    > doing it outside of ACS then a support case with VMware can be an option
                    >     2. Try using link clones. This can be done by this global setting and
                    > restarting management server
                    >     vmware.create.full.clone                    false
                    >
                    >
                    >     On 6/4/19, 9:57 AM, "Yiping Zhang" <yi...@adobe.com.INVALID> wrote:
                    >
                    >         Hi, Sergey:
                    >
                    >         Thanks for the help. By now, I have dropped and recreated DB,
                    > re-deployed this zone multiple times, blown away primary and secondary
                    > storage (including all contents on them), or just deleted the template itself
                    > from primary storage, multiple times.  Every time I ended up with the same
                    > error at the same place.
                    >
                    >         The full management server log,  from the point I seeded the
                    > systemvmtemplate for vmware, to deploying a new advanced zone and enable
                    > the zone to let CS to create system VM's and finally disable the zone to
                    > stop infinite loop of trying to recreate failed system VM's,  are posted
                    > at pastebin:
                    >
                    >
                    > https://pastebin.com/c05wiQ3R
                    >
                    >         Here are the content of relevant files for the template on primary
                    > storage:
                    >
                    >         1) /vmfsvolumes:
                    >
                    >         ls -l /vmfs/volumes/
                    >         total 2052
                    >         drwxr-xr-x    1 root     root             8 Jan  1  1970
                    > 414f6a73-87cd6dac-9585-133ddd409762
                    >         lrwxr-xr-x    1 root     root            17 Jun  4 16:37
                    > 42054b8459633172be231d72a52d59d4 -> afc5e946-03bfe3c2          <== this is
                    > the NFS datastore for primary storage
                    >         drwxr-xr-x    1 root     root             8 Jan  1  1970
                    > 5cd4b46b-fa4fcff0-d2a1-00215a9b31c0
                    >         drwxr-xr-t    1 root     root          1400 Jun  3 22:50
                    > 5cd4b471-c2318b91-8fb2-00215a9b31c0
                    >         drwxr-xr-x    1 root     root             8 Jan  1  1970
                    > 5cd4b471-da49a95b-bdb6-00215a9b31c0
                    >         drwxr-xr-x    4 root     root          4096 Jun  3 23:38
                    > afc5e946-03bfe3c2
                    >         drwxr-xr-x    1 root     root             8 Jan  1  1970
                    > b70c377c-54a9d28a-6a7b-3f462a475f73
                    >
                    >         2) content in template dir on primary storage:
                    >
                    >         ls -l
                    > /vmfs/volumes/42054b8459633172be231d72a52d59d4/533b6fcf3fa6301aadcc2b168f3f999a/
                    >         total 1154596
                    >         -rw-------    1 root     root          8192 Jun  3 23:38
                    > 533b6fcf3fa6301aadcc2b168f3f999a-000001-delta.vmdk
                    >         -rw-------    1 root     root           366 Jun  3 23:38
                    > 533b6fcf3fa6301aadcc2b168f3f999a-000001.vmdk
                    >         -rw-r--r--    1 root     root           268 Jun  3 23:38
                    > 533b6fcf3fa6301aadcc2b168f3f999a-7d5d73de.hlog
                    >         -rw-------    1 root     root          9711 Jun  3 23:38
                    > 533b6fcf3fa6301aadcc2b168f3f999a-Snapshot1.vmsn
                    >         -rw-------    1 root     root     2097152000 Jun  3 23:38
                    > 533b6fcf3fa6301aadcc2b168f3f999a-flat.vmdk
                    >         -rw-------    1 root     root           518 Jun  3 23:38
                    > 533b6fcf3fa6301aadcc2b168f3f999a.vmdk
                    >         -rw-r--r--    1 root     root           471 Jun  3 23:38
                    > 533b6fcf3fa6301aadcc2b168f3f999a.vmsd
                    >         -rwxr-xr-x    1 root     root          1402 Jun  3 23:38
                    > 533b6fcf3fa6301aadcc2b168f3f999a.vmtx
                    >
                    >         3) *.vmdk file content:
                    >
                    >         cat
                    > /vmfs/volumes/42054b8459633172be231d72a52d59d4/533b6fcf3fa6301aadcc2b168f3f999a/533b6fcf3fa6301aadcc2b168f3f999a.vmdk
                    >         # Disk DescriptorFile
                    >         version=1
                    >         encoding="UTF-8"
                    >         CID=ecb01275
                    >         parentCID=ffffffff
                    >         isNativeSnapshot="no"
                    >         createType="vmfs"
                    >
                    >         # Extent description
                    >         RW 4096000 VMFS "533b6fcf3fa6301aadcc2b168f3f999a-flat.vmdk"
                    >
                    >         # The Disk Data Base
                    >         #DDB
                    >
                    >         ddb.adapterType = "lsilogic"
                    >         ddb.geometry.cylinders = "4063"
                    >         ddb.geometry.heads = "16"
                    >         ddb.geometry.sectors = "63"
                    >         ddb.longContentID = "1c60ba48999abde959998f05ecb01275"
                    >         ddb.thinProvisioned = "1"
                    >         ddb.uuid = "60 00 C2 9b 52 6d 98 c4-1f 44 51 ce 1e 70 a9 70"
                    >         ddb.virtualHWVersion = "13"
                    >
                    >         4) *-0001.vmdk content:
                    >
                    >         cat
                    > /vmfs/volumes/42054b8459633172be231d72a52d59d4/533b6fcf3fa6301aadcc2b168f3f999a/533b6fcf3fa6301aadcc2b168f3f999a-000001.vmdk
                    >
                    >         # Disk DescriptorFile
                    >         version=1
                    >         encoding="UTF-8"
                    >         CID=ecb01275
                    >         parentCID=ecb01275
                    >         isNativeSnapshot="no"
                    >         createType="vmfsSparse"
                    >         parentFileNameHint="533b6fcf3fa6301aadcc2b168f3f999a.vmdk"
                    >         # Extent description
                    >         RW 4096000 VMFSSPARSE
                    > "533b6fcf3fa6301aadcc2b168f3f999a-000001-delta.vmdk"
                    >
                    >         # The Disk Data Base
                    >         #DDB
                    >
                    >         ddb.longContentID = "1c60ba48999abde959998f05ecb01275"
                    >
                    >
                    >         5) *.vmtx content:
                    >
                    >         cat
                    > /vmfs/volumes/42054b8459633172be231d72a52d59d4/533b6fcf3fa6301aadcc2b168f3f999a/533b6fcf3fa6301aadcc2b168f3f999a.vmtx
                    >
                    >         .encoding = "UTF-8"
                    >         config.version = "8"
                    >         virtualHW.version = "8"
                    >         nvram = "533b6fcf3fa6301aadcc2b168f3f999a.nvram"
                    >         pciBridge0.present = "TRUE"
                    >         svga.present = "TRUE"
                    >         pciBridge4.present = "TRUE"
                    >         pciBridge4.virtualDev = "pcieRootPort"
                    >         pciBridge4.functions = "8"
                    >         pciBridge5.present = "TRUE"
                    >         pciBridge5.virtualDev = "pcieRootPort"
                    >         pciBridge5.functions = "8"
                    >         pciBridge6.present = "TRUE"
                    >         pciBridge6.virtualDev = "pcieRootPort"
                    >         pciBridge6.functions = "8"
                    >         pciBridge7.present = "TRUE"
                    >         pciBridge7.virtualDev = "pcieRootPort"
                    >         pciBridge7.functions = "8"
                    >         vmci0.present = "TRUE"
                    >         hpet0.present = "TRUE"
                    >         floppy0.present = "FALSE"
                    >         memSize = "256"
                    >         scsi0.virtualDev = "lsilogic"
                    >         scsi0.present = "TRUE"
                    >         ide0:0.startConnected = "FALSE"
                    >         ide0:0.deviceType = "atapi-cdrom"
                    >         ide0:0.fileName = "CD/DVD drive 0"
                    >         ide0:0.present = "TRUE"
                    >         scsi0:0.deviceType = "scsi-hardDisk"
                    >         scsi0:0.fileName = "533b6fcf3fa6301aadcc2b168f3f999a-000001.vmdk"
                    >         scsi0:0.present = "TRUE"
                    >         displayName = "533b6fcf3fa6301aadcc2b168f3f999a"
                    >         annotation = "systemvmtemplate-4.11.2.0-vmware"
                    >         guestOS = "otherlinux-64"
                    >         toolScripts.afterPowerOn = "TRUE"
                    >         toolScripts.afterResume = "TRUE"
                    >         toolScripts.beforeSuspend = "TRUE"
                    >         toolScripts.beforePowerOff = "TRUE"
                    >         uuid.bios = "42 02 f1 40 33 e8 de e5-1a c5 93 2a c9 12 47 61"
                    >         vc.uuid = "50 02 5b d9 e9 c9 77 86-28 3e 84 00 22 2b eb d3"
                    >         firmware = "bios"
                    >         migrate.hostLog = "533b6fcf3fa6301aadcc2b168f3f999a-7d5d73de.hlog"
                    >
                    >
                    >         6) *.vmsd file content:
                    >
                    >         cat
                    > /vmfs/volumes/42054b8459633172be231d72a52d59d4/533b6fcf3fa6301aadcc2b168f3f999a/533b6fcf3fa6301aadcc2b168f3f999a.vmsd
                    >         .encoding = "UTF-8"
                    >         snapshot.lastUID = "1"
                    >         snapshot.current = "1"
                    >         snapshot0.uid = "1"
                    >         snapshot0.filename =
                    > "533b6fcf3fa6301aadcc2b168f3f999a-Snapshot1.vmsn"
                    >         snapshot0.displayName = "cloud.template.base"
                    >         snapshot0.description = "Base snapshot"
                    >         snapshot0.createTimeHigh = "363123"
                    >         snapshot0.createTimeLow = "-679076964"
                    >         snapshot0.numDisks = "1"
                    >         snapshot0.disk0.fileName = "533b6fcf3fa6301aadcc2b168f3f999a.vmdk"
                    >         snapshot0.disk0.node = "scsi0:0"
                    >         snapshot.numSnapshots = "1"
                    >
                    >         7) *-Snapshot1.vmsn content:
                    >
                    >         cat
                    > /vmfs/volumes/42054b8459633172be231d72a52d59d4/533b6fcf3fa6301aadcc2b168f3f999a/533b6fcf3fa6301aadcc2b168f3f999a-Snapshot1.vmsn
                    >
                    >         ҾSnapshot\?%?cfgFilet%t%.encoding = "UTF-8"
                    >         config.version = "8"
                    >         virtualHW.version = "8"
                    >         nvram = "533b6fcf3fa6301aadcc2b168f3f999a.nvram"
                    >         pciBridge0.present = "TRUE"
                    >         svga.present = "TRUE"
                    >         pciBridge4.present = "TRUE"
                    >         pciBridge4.virtualDev = "pcieRootPort"
                    >         pciBridge4.functions = "8"
                    >         pciBridge5.present = "TRUE"
                    >         pciBridge5.virtualDev = "pcieRootPort"
                    >         pciBridge5.functions = "8"
                    >         pciBridge6.present = "TRUE"
                    >         pciBridge6.virtualDev = "pcieRootPort"
                    >         pciBridge6.functions = "8"
                    >         pciBridge7.present = "TRUE"
                    >         pciBridge7.virtualDev = "pcieRootPort"
                    >         pciBridge7.functions = "8"
                    >         vmci0.present = "TRUE"
                    >         hpet0.present = "TRUE"
                    >         floppy0.present = "FALSE"
                    >         memSize = "256"
                    >         scsi0.virtualDev = "lsilogic"
                    >         scsi0.present = "TRUE"
                    >         ide0:0.startConnected = "FALSE"
                    >         ide0:0.deviceType = "atapi-cdrom"
                    >         ide0:0.fileName = "CD/DVD drive 0"
                    >         ide0:0.present = "TRUE"
                    >         scsi0:0.deviceType = "scsi-hardDisk"
                    >         scsi0:0.fileName = "533b6fcf3fa6301aadcc2b168f3f999a.vmdk"
                    >         scsi0:0.present = "TRUE"
                    >         displayName = "533b6fcf3fa6301aadcc2b168f3f999a"
                    >         annotation = "systemvmtemplate-4.11.2.0-vmware"
                    >         guestOS = "otherlinux-64"
                    >         toolScripts.afterPowerOn = "TRUE"
                    >         toolScripts.afterResume = "TRUE"
                    >         toolScripts.beforeSuspend = "TRUE"
                    >         toolScripts.beforePowerOff = "TRUE"
                    >         uuid.bios = "42 02 f1 40 33 e8 de e5-1a c5 93 2a c9 12 47 61"
                    >         vc.uuid = "50 02 5b d9 e9 c9 77 86-28 3e 84 00 22 2b eb d3"
                    >         firmware = "bios"
                    >         migrate.hostLog = "533b6fcf3fa6301aadcc2b168f3f999a-7d5d73de.hlog"
                    >
                    >
                    >         ------------
                    >
                    >         That's all the data on the template VMDK.
                    >
                    >         Much appreciate your time!
                    >
                    >         Yiping
                    >
                    >
                    >
                    >         On 6/4/19, 9:29 AM, "Sergey Levitskiy" <se...@hotmail.com>
                    > wrote:
                    >
                    >             Have you tried deleting the template from PS and letting ACS
                    > recopy it again? If the issue is reproducible we can try to look at what is
                    > wrong with the VMDK. Please post the content of 533b6fcf3fa6301aadcc2b168f3f999a.vmdk,
                    > 533b6fcf3fa6301aadcc2b168f3f999a-000001.vmdk and
                    > 533b6fcf3fa6301aadcc2b168f3f999a.vmx (their equivalents after ACS finishes
                    > copying the template). Also, from one of your ESX hosts, the output of these:
                    >             ls -al /vmfs/volumes
                    >             ls -al /vmfs/volumes/*/533b6fcf3fa6301aadcc2b168f3f999a (their
                    > equivalent after ACS finishes copying the template)
                    >
                    >              Can you also post management server log starting from the
                    > point you unregister and delete template from the vCenter.
                    >
                    >             On 6/4/19, 8:37 AM, "Yiping Zhang" <yi...@adobe.com.INVALID>
                    > wrote:
                    >
                    >                 I have manually imported the OVA to vCenter and
                    > successfully cloned a VM instance with it, on the same NFS datastore.
                    >
                    >
                    >                 On 6/4/19, 8:25 AM, "Sergey Levitskiy" <
                    > serg38l@hotmail.com> wrote:
                    >
                    >                     I would suspect the template is corrupted on the
                    > secondary storage. You can try disabling/enabling link clone feature and
                    > see if it works the other way.
                    >                     vmware.create.full.clone                    false
                    >
                    >                     Also systemVM template might have been generated on a
                    > newer version of vSphere and not compatible with ESXi 6.5. What you can do
                    > to validate this is to manually deploy OVA that is in Secondary storage and
                    > try to spin up VM from it directly in vCenter.
                    >
                    >
                    >
                    >                     On 6/3/19, 5:41 PM, "Yiping Zhang"
                    > <yi...@adobe.com.INVALID> wrote:
                    >
                    >                         Hi, list:
                    >
                    >                         I am struggling with deploying a new advanced zone
                    > using ACS 4.11.2.0 + vSphere 6.5 + NetApp volumes for primary and secondary
                    > storage devices. The initial setup of CS management server, seeding of
                    > systemVM template, and advanced zone deployment all went smoothly.
                    >
                    >                         Once I enabled the zone in web UI and the systemVM
                    > template gets copied/staged on to primary storage device. But subsequent VM
                    > creations from this template would fail with errors:
                    >
                    >
                    >                         2019-06-03 18:38:15,764 INFO  [c.c.h.v.m.HostMO]
                    > (DirectAgent-7:ctx-d01169cb esx-0001-a-001.example.org, job-3/job-29,
                    > cmd: CopyCommand) VM 533b6fcf3fa6301aadcc2b168f3f999a not found in host
                    > cache
                    >
                    >                         2019-06-03 18:38:17,017 INFO
                    > [c.c.h.v.r.VmwareResource] (DirectAgent-4:ctx-08b54fbd
                    > esx-0001-a-001.example.org, job-3/job-29, cmd: CopyCommand)
                    > VmwareStorageProcessor and VmwareStorageSubsystemCommandHandler
                    > successfully reconfigured
                    >
                    >                         2019-06-03 18:38:17,128 INFO
                    > [c.c.s.r.VmwareStorageProcessor] (DirectAgent-4:ctx-08b54fbd
                    > esx-0001-a-001.example.org, job-3/job-29, cmd: CopyCommand) creating full
                    > clone from template
                    >
                    >                         2019-06-03 18:38:17,657 INFO
                    > [c.c.h.v.u.VmwareHelper] (DirectAgent-4:ctx-08b54fbd
                    > esx-0001-a-001.example.org, job-3/job-29, cmd: CopyCommand)
                    > [ignored]failed toi get message for exception: Error caused by file
                    > /vmfs/volumes/afc5e946-03bfe3c2/533b6fcf3fa6301aadcc2b168f3f999a/533b6fcf3fa6301aadcc2b168f3f999a-000001.vmdk
                    >
                    >                         2019-06-03 18:38:17,658 ERROR
                    > [c.c.s.r.VmwareStorageProcessor] (DirectAgent-4:ctx-08b54fbd
                    > esx-0001-a-001.example.org, job-3/job-29, cmd: CopyCommand) clone volume
                    > from base image failed due to Exception: java.lang.RuntimeException
                    >
                    >                         Message: Error caused by file
                    > /vmfs/volumes/afc5e946-03bfe3c2/533b6fcf3fa6301aadcc2b168f3f999a/533b6fcf3fa6301aadcc2b168f3f999a-000001.vmdk
                    >
                    >
                    >
                    >                         If I try to create “new VM from template”
                    > (533b6fcf3fa6301aadcc2b168f3f999a) on vCenter UI manually,  I will receive
                    > exactly the same error message. The name of the VMDK file in the error
                    > message is a snapshot of the base disk image, but it is not part of the
                    > original template OVA on the secondary storage.  So, in the process of
                    > copying the template from secondary to primary storage, a snapshot got
                    > created and the disk became corrupted/unusable.
                    >
                    >                         Much later in the log file,  there is another
                    > error message “failed to fetch any free public IP address” (for ssvm, I
                    > think).  I don’t know if these two errors are related or if one is the root
                    > cause for the other error.
                    >
                    >                         The full management server log is uploaded as
                    > https://pastebin.com/c05wiQ3R
                    >
                    >                         Any help or insight on what went wrong here are
                    > much appreciated.
                    >
                    >                         Thanks
                    >
                    >                         Yiping
                    >
                    >
                    >
                    >
                    >
                    >
                    >
                    >
                    >
                    >
                    >
                    >
                    >
                    
                    -- 
                    
                    Andrija Panić
                    
                
                
            
            
        
        
    
    


Re: Can't start systemVM in a new advanced zone deployment

Posted by Sergey Levitskiy <se...@hotmail.com>.
Some operations' logs get transferred to the vCenter log, vpxd.log. It is not straightforward to trace, but VMware will be able to help should you open a case with them.
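
As a starting point (assuming a vCenter Server Appliance, where vpxd.log usually lives under /var/log/vmware/vpxd/), grepping for the clone task and the template name around the failure timestamp narrows the search considerably:

    # run in the appliance shell; older entries may be in the rotated vpxd-*.log.gz files
    grep -iE 'CloneVM|533b6fcf3fa6301aadcc2b168f3f999a' /var/log/vmware/vpxd/vpxd.log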


On 6/5/19, 11:39 AM, "Yiping Zhang" <yi...@adobe.com.INVALID> wrote:

    Hi, Sergey:
    
    During the time period when I had the problem cloning the template, there were only a few unique entries in vmkernel.log, and they were repeated hundreds or thousands of times by all the CPU cores:
    
    2019-06-02T16:47:00.633Z cpu9:8491061)FSS: 6751: Failed to open file 'hpilo-d0ccb15'; Requested flags 0x5, world: 8491061 [ams-ahs], (Existing flags 0x5, world: 8491029 [ams-main]): Busy
    2019-06-02T16:47:49.320Z cpu1:66415)nhpsa: hpsa_vmkScsiCmdDone:6384: Sense data: error code: 0x70, key: 0x5, info:00 00 00 00 , cmdInfo:00 00 00 00 , CmdSN: 0xd5c, worldId: 0x818e8e, Cmd: 0x85, ASC: 0x20, ASCQ: 0x0
    2019-06-02T16:47:49.320Z cpu1:66415)ScsiDeviceIO: 2948: Cmd(0x43954115be40) 0x85, CmdSN 0xd5c from world 8490638 to dev "naa.600508b1001c6d77d7dd6a0cc0953df1" failed H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x20 0x0.
    
    The device " naa.600508b1001c6d77d7dd6a0cc0953df1" is the local disk on this host.
    
    Yiping
    
    
    On 6/5/19, 11:15 AM, "Sergey Levitskiy" <se...@hotmail.com> wrote:
    
        This must be specific to that environment. In full clone mode, ACS simply calls cloneVMTask of the vSphere API, so until cloning of that template succeeds when attempted in the vSphere client, it will keep failing in ACS. Can you post vmkernel.log from your ESX host esx-0001-a-001?
        
        
        On 6/5/19, 8:47 AM, "Yiping Zhang" <yi...@adobe.com.INVALID> wrote:
        
            Well, I can always reproduce it in this particular vSphere setup, but in a different ACS+vSphere environment I don't see this problem.
            
            Yiping
            
            On 6/5/19, 1:00 AM, "Andrija Panic" <an...@gmail.com> wrote:
            
                Yiping,
                
                if you are sure you can reproduce the issue, it would be good to raise a
                GitHub issue and provide as much detail as possible.
                
                Andrija
                
                On Wed, 5 Jun 2019 at 05:29, Yiping Zhang <yi...@adobe.com.invalid>
                wrote:
                
                > Hi, Sergey:
                >
                > Thanks for the tip. After setting vmware.create.full.clone=false,  I was
                > able to create and start system VM instances.    However,  I feel that the
                > underlying problem still exists, and I am just working around it instead of
                > fixing it,  because in my lab CloudStack instance with the same version of
                > ACS and vSphere,  I still have vmware.create.full.clone=true and all is
                > working as expected.
                >
                > I did some reading on VMware docs regarding full clone vs. linked clone.
                > It seems that the best practice is to use full clone for production,
                > especially if there are high rates of changes to the disks.  So
                > eventually,  I need to understand and fix the root cause for this issue.
                > At least for now,  I am over this hurdle and I can move on.
                >
                > Thanks again,
                >
                > Yiping
                >
                > On 6/4/19, 11:13 AM, "Sergey Levitskiy" <se...@hotmail.com> wrote:
                >
                >     Everything looks good and consistent including all references in VMDK
                > and its snapshot. I would try these 2 routes:
                >     1. Figure out what vSphere error actually means from vmkernel log of
                > ESX when ACS tries to clone the template. If the same error happens while
                > doing it outside of ACS then a support case with VMware can be an option
                >     2. Try using link clones. This can be done by this global setting and
                > restarting management server
                >     vmware.create.full.clone                    false
                >
                >
                >     On 6/4/19, 9:57 AM, "Yiping Zhang" <yi...@adobe.com.INVALID> wrote:
                >
                >         Hi, Sergey:
                >
                >         Thanks for the help. By now, I have dropped and recreated DB,
                > re-deployed this zone multiple times, blown away primary and secondary
                > storage (including all contents on them), or just deleted the template itself
                > from primary storage, multiple times.  Every time I ended up with the same
                > error at the same place.
                >
                >         The full management server log,  from the point I seeded the
                > systemvmtemplate for vmware, to deploying a new advanced zone and enable
                > the zone to let CS to create system VM's and finally disable the zone to
                > stop infinite loop of trying to recreate failed system VM's,  are posted
                > at pastebin:
                >
                >
                > https://pastebin.com/c05wiQ3R
                >
                >         Here are the content of relevant files for the template on primary
                > storage:
                >
                >         1) /vmfsvolumes:
                >
                >         ls -l /vmfs/volumes/
                >         total 2052
                >         drwxr-xr-x    1 root     root             8 Jan  1  1970
                > 414f6a73-87cd6dac-9585-133ddd409762
                >         lrwxr-xr-x    1 root     root            17 Jun  4 16:37
                > 42054b8459633172be231d72a52d59d4 -> afc5e946-03bfe3c2          <== this is
                > the NFS datastore for primary storage
                >         drwxr-xr-x    1 root     root             8 Jan  1  1970
                > 5cd4b46b-fa4fcff0-d2a1-00215a9b31c0
                >         drwxr-xr-t    1 root     root          1400 Jun  3 22:50
                > 5cd4b471-c2318b91-8fb2-00215a9b31c0
                >         drwxr-xr-x    1 root     root             8 Jan  1  1970
                > 5cd4b471-da49a95b-bdb6-00215a9b31c0
                >         drwxr-xr-x    4 root     root          4096 Jun  3 23:38
                > afc5e946-03bfe3c2
                >         drwxr-xr-x    1 root     root             8 Jan  1  1970
                > b70c377c-54a9d28a-6a7b-3f462a475f73
                >
                >         2) content in template dir on primary storage:
                >
                >         ls -l
                > /vmfs/volumes/42054b8459633172be231d72a52d59d4/533b6fcf3fa6301aadcc2b168f3f999a/
                >         total 1154596
                >         -rw-------    1 root     root          8192 Jun  3 23:38
                > 533b6fcf3fa6301aadcc2b168f3f999a-000001-delta.vmdk
                >         -rw-------    1 root     root           366 Jun  3 23:38
                > 533b6fcf3fa6301aadcc2b168f3f999a-000001.vmdk
                >         -rw-r--r--    1 root     root           268 Jun  3 23:38
                > 533b6fcf3fa6301aadcc2b168f3f999a-7d5d73de.hlog
                >         -rw-------    1 root     root          9711 Jun  3 23:38
                > 533b6fcf3fa6301aadcc2b168f3f999a-Snapshot1.vmsn
                >         -rw-------    1 root     root     2097152000 Jun  3 23:38
                > 533b6fcf3fa6301aadcc2b168f3f999a-flat.vmdk
                >         -rw-------    1 root     root           518 Jun  3 23:38
                > 533b6fcf3fa6301aadcc2b168f3f999a.vmdk
                >         -rw-r--r--    1 root     root           471 Jun  3 23:38
                > 533b6fcf3fa6301aadcc2b168f3f999a.vmsd
                >         -rwxr-xr-x    1 root     root          1402 Jun  3 23:38
                > 533b6fcf3fa6301aadcc2b168f3f999a.vmtx
                >
                >         3) *.vmdk file content:
                >
                >         cat
                > /vmfs/volumes/42054b8459633172be231d72a52d59d4/533b6fcf3fa6301aadcc2b168f3f999a/533b6fcf3fa6301aadcc2b168f3f999a.vmdk
                >         # Disk DescriptorFile
                >         version=1
                >         encoding="UTF-8"
                >         CID=ecb01275
                >         parentCID=ffffffff
                >         isNativeSnapshot="no"
                >         createType="vmfs"
                >
                >         # Extent description
                >         RW 4096000 VMFS "533b6fcf3fa6301aadcc2b168f3f999a-flat.vmdk"
                >
                >         # The Disk Data Base
                >         #DDB
                >
                >         ddb.adapterType = "lsilogic"
                >         ddb.geometry.cylinders = "4063"
                >         ddb.geometry.heads = "16"
                >         ddb.geometry.sectors = "63"
                >         ddb.longContentID = "1c60ba48999abde959998f05ecb01275"
                >         ddb.thinProvisioned = "1"
                >         ddb.uuid = "60 00 C2 9b 52 6d 98 c4-1f 44 51 ce 1e 70 a9 70"
                >         ddb.virtualHWVersion = "13"
                >
                >         4) *-0001.vmdk content:
                >
                >         cat
                > /vmfs/volumes/42054b8459633172be231d72a52d59d4/533b6fcf3fa6301aadcc2b168f3f999a/533b6fcf3fa6301aadcc2b168f3f999a-000001.vmdk
                >
                >         # Disk DescriptorFile
                >         version=1
                >         encoding="UTF-8"
                >         CID=ecb01275
                >         parentCID=ecb01275
                >         isNativeSnapshot="no"
                >         createType="vmfsSparse"
                >         parentFileNameHint="533b6fcf3fa6301aadcc2b168f3f999a.vmdk"
                >         # Extent description
                >         RW 4096000 VMFSSPARSE
                > "533b6fcf3fa6301aadcc2b168f3f999a-000001-delta.vmdk"
                >
                >         # The Disk Data Base
                >         #DDB
                >
                >         ddb.longContentID = "1c60ba48999abde959998f05ecb01275"
                >
                >
                >         5) *.vmtx content:
                >
                >         cat
                > /vmfs/volumes/42054b8459633172be231d72a52d59d4/533b6fcf3fa6301aadcc2b168f3f999a/533b6fcf3fa6301aadcc2b168f3f999a.vmtx
                >
                >         .encoding = "UTF-8"
                >         config.version = "8"
                >         virtualHW.version = "8"
                >         nvram = "533b6fcf3fa6301aadcc2b168f3f999a.nvram"
                >         pciBridge0.present = "TRUE"
                >         svga.present = "TRUE"
                >         pciBridge4.present = "TRUE"
                >         pciBridge4.virtualDev = "pcieRootPort"
                >         pciBridge4.functions = "8"
                >         pciBridge5.present = "TRUE"
                >         pciBridge5.virtualDev = "pcieRootPort"
                >         pciBridge5.functions = "8"
                >         pciBridge6.present = "TRUE"
                >         pciBridge6.virtualDev = "pcieRootPort"
                >         pciBridge6.functions = "8"
                >         pciBridge7.present = "TRUE"
                >         pciBridge7.virtualDev = "pcieRootPort"
                >         pciBridge7.functions = "8"
                >         vmci0.present = "TRUE"
                >         hpet0.present = "TRUE"
                >         floppy0.present = "FALSE"
                >         memSize = "256"
                >         scsi0.virtualDev = "lsilogic"
                >         scsi0.present = "TRUE"
                >         ide0:0.startConnected = "FALSE"
                >         ide0:0.deviceType = "atapi-cdrom"
                >         ide0:0.fileName = "CD/DVD drive 0"
                >         ide0:0.present = "TRUE"
                >         scsi0:0.deviceType = "scsi-hardDisk"
                >         scsi0:0.fileName = "533b6fcf3fa6301aadcc2b168f3f999a-000001.vmdk"
                >         scsi0:0.present = "TRUE"
                >         displayName = "533b6fcf3fa6301aadcc2b168f3f999a"
                >         annotation = "systemvmtemplate-4.11.2.0-vmware"
                >         guestOS = "otherlinux-64"
                >         toolScripts.afterPowerOn = "TRUE"
                >         toolScripts.afterResume = "TRUE"
                >         toolScripts.beforeSuspend = "TRUE"
                >         toolScripts.beforePowerOff = "TRUE"
                >         uuid.bios = "42 02 f1 40 33 e8 de e5-1a c5 93 2a c9 12 47 61"
                >         vc.uuid = "50 02 5b d9 e9 c9 77 86-28 3e 84 00 22 2b eb d3"
                >         firmware = "bios"
                >         migrate.hostLog = "533b6fcf3fa6301aadcc2b168f3f999a-7d5d73de.hlog"
                >
                >
                >         6) *.vmsd file content:
                >
                >         cat
                > /vmfs/volumes/42054b8459633172be231d72a52d59d4/533b6fcf3fa6301aadcc2b168f3f999a/533b6fcf3fa6301aadcc2b168f3f999a.vmsd
                >         .encoding = "UTF-8"
                >         snapshot.lastUID = "1"
                >         snapshot.current = "1"
                >         snapshot0.uid = "1"
                >         snapshot0.filename =
                > "533b6fcf3fa6301aadcc2b168f3f999a-Snapshot1.vmsn"
                >         snapshot0.displayName = "cloud.template.base"
                >         snapshot0.description = "Base snapshot"
                >         snapshot0.createTimeHigh = "363123"
                >         snapshot0.createTimeLow = "-679076964"
                >         snapshot0.numDisks = "1"
                >         snapshot0.disk0.fileName = "533b6fcf3fa6301aadcc2b168f3f999a.vmdk"
                >         snapshot0.disk0.node = "scsi0:0"
                >         snapshot.numSnapshots = "1"
                >
                >         7) *-Snapshot1.vmsn content:
                >
                >         cat
                > /vmfs/volumes/42054b8459633172be231d72a52d59d4/533b6fcf3fa6301aadcc2b168f3f999a/533b6fcf3fa6301aadcc2b168f3f999a-Snapshot1.vmsn
                >
                >         ҾSnapshot\?%?cfgFilet%t%.encoding = "UTF-8"
                >         config.version = "8"
                >         virtualHW.version = "8"
                >         nvram = "533b6fcf3fa6301aadcc2b168f3f999a.nvram"
                >         pciBridge0.present = "TRUE"
                >         svga.present = "TRUE"
                >         pciBridge4.present = "TRUE"
                >         pciBridge4.virtualDev = "pcieRootPort"
                >         pciBridge4.functions = "8"
                >         pciBridge5.present = "TRUE"
                >         pciBridge5.virtualDev = "pcieRootPort"
                >         pciBridge5.functions = "8"
                >         pciBridge6.present = "TRUE"
                >         pciBridge6.virtualDev = "pcieRootPort"
                >         pciBridge6.functions = "8"
                >         pciBridge7.present = "TRUE"
                >         pciBridge7.virtualDev = "pcieRootPort"
                >         pciBridge7.functions = "8"
                >         vmci0.present = "TRUE"
                >         hpet0.present = "TRUE"
                >         floppy0.present = "FALSE"
                >         memSize = "256"
                >         scsi0.virtualDev = "lsilogic"
                >         scsi0.present = "TRUE"
                >         ide0:0.startConnected = "FALSE"
                >         ide0:0.deviceType = "atapi-cdrom"
                >         ide0:0.fileName = "CD/DVD drive 0"
                >         ide0:0.present = "TRUE"
                >         scsi0:0.deviceType = "scsi-hardDisk"
                >         scsi0:0.fileName = "533b6fcf3fa6301aadcc2b168f3f999a.vmdk"
                >         scsi0:0.present = "TRUE"
                >         displayName = "533b6fcf3fa6301aadcc2b168f3f999a"
                >         annotation = "systemvmtemplate-4.11.2.0-vmware"
                >         guestOS = "otherlinux-64"
                >         toolScripts.afterPowerOn = "TRUE"
                >         toolScripts.afterResume = "TRUE"
                >         toolScripts.beforeSuspend = "TRUE"
                >         toolScripts.beforePowerOff = "TRUE"
                >         uuid.bios = "42 02 f1 40 33 e8 de e5-1a c5 93 2a c9 12 47 61"
                >         vc.uuid = "50 02 5b d9 e9 c9 77 86-28 3e 84 00 22 2b eb d3"
                >         firmware = "bios"
                >         migrate.hostLog = "533b6fcf3fa6301aadcc2b168f3f999a-7d5d73de.hlog"
                >
                >
                >         ------------
                >
                >         That's all the data on the template VMDK.
                >
                >         Much appreciate your time!
                >
                >         Yiping
                >
                >
                >
                >         On 6/4/19, 9:29 AM, "Sergey Levitskiy" <se...@hotmail.com>
                > wrote:
                >
                >             Have you tried deleting the template from PS and letting ACS
                > recopy it again? If the issue is reproducible, we can try to look at what
                > is wrong with the VMDK. Please post the content of
                > 533b6fcf3fa6301aadcc2b168f3f999a.vmdk,
                > 533b6fcf3fa6301aadcc2b168f3f999a-000001.vmdk and
                > 533b6fcf3fa6301aadcc2b168f3f999a.vmx (their equivalent after ACS finishes
                > copying the template). Also, from one of your ESX hosts, the output of:
                >             ls -al /vmfs/volumes
                >             ls -al /vmfs/volumes/*/533b6fcf3fa6301aadcc2b168f3f999a (their
                > equivalent after ACS finishes copying the template)
                >
                >              Can you also post the management server log starting from the
                > point you unregister and delete the template from vCenter.
                >
                >             On 6/4/19, 8:37 AM, "Yiping Zhang" <yi...@adobe.com.INVALID>
                > wrote:
                >
                >                 I have manually imported the OVA to vCenter and
                > successfully cloned a VM instance with it, on the same NFS datastore.
                >
                >
                >                 On 6/4/19, 8:25 AM, "Sergey Levitskiy" <
                > serg38l@hotmail.com> wrote:
                >
                >                     I would suspect the template is corrupted on the
                > secondary storage. You can try disabling/enabling link clone feature and
                > see if it works the other way.
                >                     vmware.create.full.clone                    false
                >
                >                     Also systemVM template might have been generated on a
                > newer version of vSphere and not compatible with ESXi 6.5. What you can do
                > to validate this is to manually deploy OVA that is in Secondary storage and
                > try to spin up VM from it directly in vCenter.
                >
                >
                >
                >                     On 6/3/19, 5:41 PM, "Yiping Zhang"
                > <yi...@adobe.com.INVALID> wrote:
                >
                >                         Hi, list:
                >
                >                         I am struggling with deploying a new advanced zone
                > using ACS 4.11.2.0 + vSphere 6.5 + NetApp volumes for primary and secondary
                > storage devices. The initial setup of CS management server, seeding of
                > systemVM template, and advanced zone deployment all went smoothly.
                >
                >                         Once I enabled the zone in web UI and the systemVM
                > template gets copied/staged on to primary storage device. But subsequent VM
                > creations from this template would fail with errors:
                >
                >
                >                         2019-06-03 18:38:15,764 INFO  [c.c.h.v.m.HostMO]
                > (DirectAgent-7:ctx-d01169cb esx-0001-a-001.example.org, job-3/job-29,
                > cmd: CopyCommand) VM 533b6fcf3fa6301aadcc2b168f3f999a not found in host
                > cache
                >
                >                         2019-06-03 18:38:17,017 INFO
                > [c.c.h.v.r.VmwareResource] (DirectAgent-4:ctx-08b54fbd
                > esx-0001-a-001.example.org, job-3/job-29, cmd: CopyCommand)
                > VmwareStorageProcessor and VmwareStorageSubsystemCommandHandler
                > successfully reconfigured
                >
                >                         2019-06-03 18:38:17,128 INFO
                > [c.c.s.r.VmwareStorageProcessor] (DirectAgent-4:ctx-08b54fbd
                > esx-0001-a-001.example.org, job-3/job-29, cmd: CopyCommand) creating full
                > clone from template
                >
                >                         2019-06-03 18:38:17,657 INFO
                > [c.c.h.v.u.VmwareHelper] (DirectAgent-4:ctx-08b54fbd
                > esx-0001-a-001.example.org, job-3/job-29, cmd: CopyCommand)
                > [ignored]failed toi get message for exception: Error caused by file
                > /vmfs/volumes/afc5e946-03bfe3c2/533b6fcf3fa6301aadcc2b168f3f999a/533b6fcf3fa6301aadcc2b168f3f999a-000001.vmdk
                >
                >                         2019-06-03 18:38:17,658 ERROR
                > [c.c.s.r.VmwareStorageProcessor] (DirectAgent-4:ctx-08b54fbd
                > esx-0001-a-001.example.org, job-3/job-29, cmd: CopyCommand) clone volume
                > from base image failed due to Exception: java.lang.RuntimeException
                >
                >                         Message: Error caused by file
                > /vmfs/volumes/afc5e946-03bfe3c2/533b6fcf3fa6301aadcc2b168f3f999a/533b6fcf3fa6301aadcc2b168f3f999a-000001.vmdk
                >
                >
                >
                >                         If I try to create “new VM from template”
                > (533b6fcf3fa6301aadcc2b168f3f999a) on vCenter UI manually,  I will receive
                > exactly the same error message. The name of the VMDK file in the error
                > message is a snapshot of the base disk image, but it is not part of the
                > original template OVA on the secondary storage.  So, in the process of
                > copying the template from secondary to primary storage, a snapshot got
                > created and the disk became corrupted/unusable.
                >
                >                         Much later in the log file,  there is another
                > error message “failed to fetch any free public IP address” (for ssvm, I
                > think).  I don’t know if these two errors are related or if one is the root
                > cause for the other error.
                >
                >                         The full management server log is uploaded as
                > https://pastebin.com/c05wiQ3R
                >
                >                         Any help or insight on what went wrong here are
                > much appreciated.
                >
                >                         Thanks
                >
                >                         Yiping
                >
                >
                >
                >
                >
                >
                >
                >
                >
                >
                >
                >
                >
                
                -- 
                
                Andrija Panić
                
            
            
        
        
    
    


Re: Can't start systemVM in a new advanced zone deployment

Posted by Yiping Zhang <yi...@adobe.com.INVALID>.
Hi, Sergey:

During the time period when I had the problem cloning the template, there were only a few unique entries in vmkernel.log, and they were repeated hundreds/thousands of times across all the CPU cores:

2019-06-02T16:47:00.633Z cpu9:8491061)FSS: 6751: Failed to open file 'hpilo-d0ccb15'; Requested flags 0x5, world: 8491061 [ams-ahs], (Existing flags 0x5, world: 8491029 [ams-main]): Busy
2019-06-02T16:47:49.320Z cpu1:66415)nhpsa: hpsa_vmkScsiCmdDone:6384: Sense data: error code: 0x70, key: 0x5, info:00 00 00 00 , cmdInfo:00 00 00 00 , CmdSN: 0xd5c, worldId: 0x818e8e, Cmd: 0x85, ASC: 0x20, ASCQ: 0x0
2019-06-02T16:47:49.320Z cpu1:66415)ScsiDeviceIO: 2948: Cmd(0x43954115be40) 0x85, CmdSN 0xd5c from world 8490638 to dev "naa.600508b1001c6d77d7dd6a0cc0953df1" failed H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x20 0x0.

The device " naa.600508b1001c6d77d7dd6a0cc0953df1" is the local disk on this host.
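
A quick way to isolate the clone-related entries, assuming the default ESXi log locations and the template name from this thread (hostd.log is often more informative than vmkernel.log for NFS datastore operations); adjust the timestamp to the failure window:

    # on the ESXi host that executed the CopyCommand
    grep -i 533b6fcf3fa6301aadcc2b168f3f999a /var/log/vmkernel.log /var/log/hostd.log
    # narrow to the minute of the failed clone seen in the management server log
    grep '2019-06-03T23:38' /var/log/vmkernel.log /var/log/hostd.log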

Yiping


On 6/5/19, 11:15 AM, "Sergey Levitskiy" <se...@hotmail.com> wrote:

    This must be specific to that environment. For full clone mode, ACS simply calls cloneVMTask of the vSphere API, so until cloning of that template succeeds when attempted in the vSphere client, it will keep failing in ACS. Can you post vmkernel.log from your ESX host esx-0001-a-001?
    
    
    On 6/5/19, 8:47 AM, "Yiping Zhang" <yi...@adobe.com.INVALID> wrote:
    
        Well, I can always reproduce it in this particular vSphere setup, but in a different ACS+vSphere environment, I don't see this problem.
        
        Yiping
        
        On 6/5/19, 1:00 AM, "Andrija Panic" <an...@gmail.com> wrote:
        
            Yiping,
            
            if you are sure you can reproduce the issue, it would be good to raise a
            GitHub issue and provide as much detail as possible.
            
            Andrija
            
            On Wed, 5 Jun 2019 at 05:29, Yiping Zhang <yi...@adobe.com.invalid>
            wrote:
            
            > Hi, Sergey:
            >
            > Thanks for the tip. After setting vmware.create.full.clone=false,  I was
            > able to create and start system VM instances.    However,  I feel that the
            > underlying problem still exists, and I am just working around it instead of
            > fixing it,  because in my lab CloudStack instance with the same version of
            > ACS and vSphere,  I still have vmware.create.full.clone=true and all is
            > working as expected.
            >
            > I did some reading on VMware docs regarding full clone vs. linked clone.
            > It seems that the best practice is to use full clone for production,
            > especially if there are high rates of changes to the disks.  So
            > eventually,  I need to understand and fix the root cause for this issue.
            > At least for now,  I am over this hurdle and I can move on.
            >
            > Thanks again,
            >
            > Yiping
            >
            > On 6/4/19, 11:13 AM, "Sergey Levitskiy" <se...@hotmail.com> wrote:
            >
            >     Everything looks good and consistent including all references in VMDK
            > and its snapshot. I would try these 2 routes:
            >     1. Figure out what vSphere error actually means from vmkernel log of
            > ESX when ACS tries to clone the template. If the same error happens while
            > doing it outside of ACS then a support case with VMware can be an option
            >     2. Try using link clones. This can be done by this global setting and
            > restarting management server
            >     vmware.create.full.clone                    false
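
             A minimal sketch of applying that setting change, assuming the CloudMonkey
             CLI is configured against this management server and a systemd-based
             install (these commands are not taken from this thread):

                 # in the CloudMonkey CLI, update the global setting
                 update configuration name=vmware.create.full.clone value=false
                 # on the management server host, restart so the new value takes effect
                 systemctl restart cloudstack-management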
            >
            >
            >     On 6/4/19, 9:57 AM, "Yiping Zhang" <yi...@adobe.com.INVALID> wrote:
            >
            >         Hi, Sergey:
            >
             >         Thanks for the help. By now, I have dropped and recreated the DB,
             > re-deployed this zone multiple times, blown away primary and secondary
             > storage (including all contents on them), or just deleted the template
             > itself from primary storage, multiple times. Every time I ended up with
             > the same error at the same place.
            >
             >         The full management server log, from the point I seeded the
             > systemvmtemplate for vmware, to deploying a new advanced zone, enabling
             > the zone to let CS create system VMs, and finally disabling the zone to
             > stop the infinite loop of trying to recreate failed system VMs, is posted
             > at pastebin:
            >
            >
             > https://pastebin.com/c05wiQ3R
            >
            >         Here are the content of relevant files for the template on primary
            > storage:
            >
            >         1) /vmfsvolumes:
            >
            >         ls -l /vmfs/volumes/
            >         total 2052
            >         drwxr-xr-x    1 root     root             8 Jan  1  1970
            > 414f6a73-87cd6dac-9585-133ddd409762
            >         lrwxr-xr-x    1 root     root            17 Jun  4 16:37
            > 42054b8459633172be231d72a52d59d4 -> afc5e946-03bfe3c2          <== this is
            > the NFS datastore for primary storage
            >         drwxr-xr-x    1 root     root             8 Jan  1  1970
            > 5cd4b46b-fa4fcff0-d2a1-00215a9b31c0
            >         drwxr-xr-t    1 root     root          1400 Jun  3 22:50
            > 5cd4b471-c2318b91-8fb2-00215a9b31c0
            >         drwxr-xr-x    1 root     root             8 Jan  1  1970
            > 5cd4b471-da49a95b-bdb6-00215a9b31c0
            >         drwxr-xr-x    4 root     root          4096 Jun  3 23:38
            > afc5e946-03bfe3c2
            >         drwxr-xr-x    1 root     root             8 Jan  1  1970
            > b70c377c-54a9d28a-6a7b-3f462a475f73
            >
            >         2) content in template dir on primary storage:
            >
            >         ls -l
            > /vmfs/volumes/42054b8459633172be231d72a52d59d4/533b6fcf3fa6301aadcc2b168f3f999a/
            >         total 1154596
            >         -rw-------    1 root     root          8192 Jun  3 23:38
            > 533b6fcf3fa6301aadcc2b168f3f999a-000001-delta.vmdk
            >         -rw-------    1 root     root           366 Jun  3 23:38
            > 533b6fcf3fa6301aadcc2b168f3f999a-000001.vmdk
            >         -rw-r--r--    1 root     root           268 Jun  3 23:38
            > 533b6fcf3fa6301aadcc2b168f3f999a-7d5d73de.hlog
            >         -rw-------    1 root     root          9711 Jun  3 23:38
            > 533b6fcf3fa6301aadcc2b168f3f999a-Snapshot1.vmsn
            >         -rw-------    1 root     root     2097152000 Jun  3 23:38
            > 533b6fcf3fa6301aadcc2b168f3f999a-flat.vmdk
            >         -rw-------    1 root     root           518 Jun  3 23:38
            > 533b6fcf3fa6301aadcc2b168f3f999a.vmdk
            >         -rw-r--r--    1 root     root           471 Jun  3 23:38
            > 533b6fcf3fa6301aadcc2b168f3f999a.vmsd
            >         -rwxr-xr-x    1 root     root          1402 Jun  3 23:38
            > 533b6fcf3fa6301aadcc2b168f3f999a.vmtx
            >
            >         3) *.vmdk file content:
            >
            >         cat
            > /vmfs/volumes/42054b8459633172be231d72a52d59d4/533b6fcf3fa6301aadcc2b168f3f999a/533b6fcf3fa6301aadcc2b168f3f999a.vmdk
            >         # Disk DescriptorFile
            >         version=1
            >         encoding="UTF-8"
            >         CID=ecb01275
            >         parentCID=ffffffff
            >         isNativeSnapshot="no"
            >         createType="vmfs"
            >
            >         # Extent description
            >         RW 4096000 VMFS "533b6fcf3fa6301aadcc2b168f3f999a-flat.vmdk"
            >
            >         # The Disk Data Base
            >         #DDB
            >
            >         ddb.adapterType = "lsilogic"
            >         ddb.geometry.cylinders = "4063"
            >         ddb.geometry.heads = "16"
            >         ddb.geometry.sectors = "63"
            >         ddb.longContentID = "1c60ba48999abde959998f05ecb01275"
            >         ddb.thinProvisioned = "1"
            >         ddb.uuid = "60 00 C2 9b 52 6d 98 c4-1f 44 51 ce 1e 70 a9 70"
            >         ddb.virtualHWVersion = "13"
            >
            >         4) *-0001.vmdk content:
            >
            >         cat
            > /vmfs/volumes/42054b8459633172be231d72a52d59d4/533b6fcf3fa6301aadcc2b168f3f999a/533b6fcf3fa6301aadcc2b168f3f999a-000001.vmdk
            >
            >         # Disk DescriptorFile
            >         version=1
            >         encoding="UTF-8"
            >         CID=ecb01275
            >         parentCID=ecb01275
            >         isNativeSnapshot="no"
            >         createType="vmfsSparse"
            >         parentFileNameHint="533b6fcf3fa6301aadcc2b168f3f999a.vmdk"
            >         # Extent description
            >         RW 4096000 VMFSSPARSE
            > "533b6fcf3fa6301aadcc2b168f3f999a-000001-delta.vmdk"
            >
            >         # The Disk Data Base
            >         #DDB
            >
            >         ddb.longContentID = "1c60ba48999abde959998f05ecb01275"
            >
            >
            >         5) *.vmtx content:
            >
            >         cat
            > /vmfs/volumes/42054b8459633172be231d72a52d59d4/533b6fcf3fa6301aadcc2b168f3f999a/533b6fcf3fa6301aadcc2b168f3f999a.vmtx
            >
            >         .encoding = "UTF-8"
            >         config.version = "8"
            >         virtualHW.version = "8"
            >         nvram = "533b6fcf3fa6301aadcc2b168f3f999a.nvram"
            >         pciBridge0.present = "TRUE"
            >         svga.present = "TRUE"
            >         pciBridge4.present = "TRUE"
            >         pciBridge4.virtualDev = "pcieRootPort"
            >         pciBridge4.functions = "8"
            >         pciBridge5.present = "TRUE"
            >         pciBridge5.virtualDev = "pcieRootPort"
            >         pciBridge5.functions = "8"
            >         pciBridge6.present = "TRUE"
            >         pciBridge6.virtualDev = "pcieRootPort"
            >         pciBridge6.functions = "8"
            >         pciBridge7.present = "TRUE"
            >         pciBridge7.virtualDev = "pcieRootPort"
            >         pciBridge7.functions = "8"
            >         vmci0.present = "TRUE"
            >         hpet0.present = "TRUE"
            >         floppy0.present = "FALSE"
            >         memSize = "256"
            >         scsi0.virtualDev = "lsilogic"
            >         scsi0.present = "TRUE"
            >         ide0:0.startConnected = "FALSE"
            >         ide0:0.deviceType = "atapi-cdrom"
            >         ide0:0.fileName = "CD/DVD drive 0"
            >         ide0:0.present = "TRUE"
            >         scsi0:0.deviceType = "scsi-hardDisk"
            >         scsi0:0.fileName = "533b6fcf3fa6301aadcc2b168f3f999a-000001.vmdk"
            >         scsi0:0.present = "TRUE"
            >         displayName = "533b6fcf3fa6301aadcc2b168f3f999a"
            >         annotation = "systemvmtemplate-4.11.2.0-vmware"
            >         guestOS = "otherlinux-64"
            >         toolScripts.afterPowerOn = "TRUE"
            >         toolScripts.afterResume = "TRUE"
            >         toolScripts.beforeSuspend = "TRUE"
            >         toolScripts.beforePowerOff = "TRUE"
            >         uuid.bios = "42 02 f1 40 33 e8 de e5-1a c5 93 2a c9 12 47 61"
            >         vc.uuid = "50 02 5b d9 e9 c9 77 86-28 3e 84 00 22 2b eb d3"
            >         firmware = "bios"
            >         migrate.hostLog = "533b6fcf3fa6301aadcc2b168f3f999a-7d5d73de.hlog"
            >
            >
            >         6) *.vmsd file content:
            >
            >         cat
            > /vmfs/volumes/42054b8459633172be231d72a52d59d4/533b6fcf3fa6301aadcc2b168f3f999a/533b6fcf3fa6301aadcc2b168f3f999a.vmsd
            >         .encoding = "UTF-8"
            >         snapshot.lastUID = "1"
            >         snapshot.current = "1"
            >         snapshot0.uid = "1"
            >         snapshot0.filename =
            > "533b6fcf3fa6301aadcc2b168f3f999a-Snapshot1.vmsn"
            >         snapshot0.displayName = "cloud.template.base"
            >         snapshot0.description = "Base snapshot"
            >         snapshot0.createTimeHigh = "363123"
            >         snapshot0.createTimeLow = "-679076964"
            >         snapshot0.numDisks = "1"
            >         snapshot0.disk0.fileName = "533b6fcf3fa6301aadcc2b168f3f999a.vmdk"
            >         snapshot0.disk0.node = "scsi0:0"
            >         snapshot.numSnapshots = "1"
            >
            >         7) *-Snapshot1.vmsn content:
            >
            >         cat
            > /vmfs/volumes/42054b8459633172be231d72a52d59d4/533b6fcf3fa6301aadcc2b168f3f999a/533b6fcf3fa6301aadcc2b168f3f999a-Snapshot1.vmsn
            >
            >         ҾSnapshot\?%?cfgFilet%t%.encoding = "UTF-8"
            >         config.version = "8"
            >         virtualHW.version = "8"
            >         nvram = "533b6fcf3fa6301aadcc2b168f3f999a.nvram"
            >         pciBridge0.present = "TRUE"
            >         svga.present = "TRUE"
            >         pciBridge4.present = "TRUE"
            >         pciBridge4.virtualDev = "pcieRootPort"
            >         pciBridge4.functions = "8"
            >         pciBridge5.present = "TRUE"
            >         pciBridge5.virtualDev = "pcieRootPort"
            >         pciBridge5.functions = "8"
            >         pciBridge6.present = "TRUE"
            >         pciBridge6.virtualDev = "pcieRootPort"
            >         pciBridge6.functions = "8"
            >         pciBridge7.present = "TRUE"
            >         pciBridge7.virtualDev = "pcieRootPort"
            >         pciBridge7.functions = "8"
            >         vmci0.present = "TRUE"
            >         hpet0.present = "TRUE"
            >         floppy0.present = "FALSE"
            >         memSize = "256"
            >         scsi0.virtualDev = "lsilogic"
            >         scsi0.present = "TRUE"
            >         ide0:0.startConnected = "FALSE"
            >         ide0:0.deviceType = "atapi-cdrom"
            >         ide0:0.fileName = "CD/DVD drive 0"
            >         ide0:0.present = "TRUE"
            >         scsi0:0.deviceType = "scsi-hardDisk"
            >         scsi0:0.fileName = "533b6fcf3fa6301aadcc2b168f3f999a.vmdk"
            >         scsi0:0.present = "TRUE"
            >         displayName = "533b6fcf3fa6301aadcc2b168f3f999a"
            >         annotation = "systemvmtemplate-4.11.2.0-vmware"
            >         guestOS = "otherlinux-64"
            >         toolScripts.afterPowerOn = "TRUE"
            >         toolScripts.afterResume = "TRUE"
            >         toolScripts.beforeSuspend = "TRUE"
            >         toolScripts.beforePowerOff = "TRUE"
            >         uuid.bios = "42 02 f1 40 33 e8 de e5-1a c5 93 2a c9 12 47 61"
            >         vc.uuid = "50 02 5b d9 e9 c9 77 86-28 3e 84 00 22 2b eb d3"
            >         firmware = "bios"
            >         migrate.hostLog = "533b6fcf3fa6301aadcc2b168f3f999a-7d5d73de.hlog"
            >
            >
            >         ------------
            >
            >         That's all the data on the template VMDK.
            >
            >         Much appreciate your time!
            >
            >         Yiping
            >
            >
            >
            >         On 6/4/19, 9:29 AM, "Sergey Levitskiy" <se...@hotmail.com>
            > wrote:
            >
             >             Have you tried deleting the template from PS and letting ACS
             > recopy it again? If the issue is reproducible, we can try to look at what
             > is wrong with the VMDK. Please post the content of
             > 533b6fcf3fa6301aadcc2b168f3f999a.vmdk,
             > 533b6fcf3fa6301aadcc2b168f3f999a-000001.vmdk and
             > 533b6fcf3fa6301aadcc2b168f3f999a.vmx (their equivalent after ACS finishes
             > copying the template). Also, from one of your ESX hosts, the output of:
             >             ls -al /vmfs/volumes
             >             ls -al /vmfs/volumes/*/533b6fcf3fa6301aadcc2b168f3f999a (their
             > equivalent after ACS finishes copying the template)
            >
             >              Can you also post the management server log starting from the
             > point you unregister and delete the template from vCenter.
            >
            >             On 6/4/19, 8:37 AM, "Yiping Zhang" <yi...@adobe.com.INVALID>
            > wrote:
            >
            >                 I have manually imported the OVA to vCenter and
            > successfully cloned a VM instance with it, on the same NFS datastore.
            >
            >
            >                 On 6/4/19, 8:25 AM, "Sergey Levitskiy" <
            > serg38l@hotmail.com> wrote:
            >
            >                     I would suspect the template is corrupted on the
            > secondary storage. You can try disabling/enabling link clone feature and
            > see if it works the other way.
            >                     vmware.create.full.clone                    false
            >
            >                     Also systemVM template might have been generated on a
            > newer version of vSphere and not compatible with ESXi 6.5. What you can do
            > to validate this is to manually deploy OVA that is in Secondary storage and
            > try to spin up VM from it directly in vCenter.
            >
            >
            >
            >                     On 6/3/19, 5:41 PM, "Yiping Zhang"
            > <yi...@adobe.com.INVALID> wrote:
            >
            >                         Hi, list:
            >
            >                         I am struggling with deploying a new advanced zone
            > using ACS 4.11.2.0 + vSphere 6.5 + NetApp volumes for primary and secondary
            > storage devices. The initial setup of CS management server, seeding of
            > systemVM template, and advanced zone deployment all went smoothly.
            >
            >                         Once I enabled the zone in web UI and the systemVM
            > template gets copied/staged on to primary storage device. But subsequent VM
            > creations from this template would fail with errors:
            >
            >
            >                         2019-06-03 18:38:15,764 INFO  [c.c.h.v.m.HostMO]
            > (DirectAgent-7:ctx-d01169cb esx-0001-a-001.example.org, job-3/job-29,
            > cmd: CopyCommand) VM 533b6fcf3fa6301aadcc2b168f3f999a not found in host
            > cache
            >
            >                         2019-06-03 18:38:17,017 INFO
            > [c.c.h.v.r.VmwareResource] (DirectAgent-4:ctx-08b54fbd
            > esx-0001-a-001.example.org, job-3/job-29, cmd: CopyCommand)
            > VmwareStorageProcessor and VmwareStorageSubsystemCommandHandler
            > successfully reconfigured
            >
            >                         2019-06-03 18:38:17,128 INFO
            > [c.c.s.r.VmwareStorageProcessor] (DirectAgent-4:ctx-08b54fbd
            > esx-0001-a-001.example.org, job-3/job-29, cmd: CopyCommand) creating full
            > clone from template
            >
            >                         2019-06-03 18:38:17,657 INFO
            > [c.c.h.v.u.VmwareHelper] (DirectAgent-4:ctx-08b54fbd
            > esx-0001-a-001.example.org, job-3/job-29, cmd: CopyCommand)
            > [ignored]failed toi get message for exception: Error caused by file
            > /vmfs/volumes/afc5e946-03bfe3c2/533b6fcf3fa6301aadcc2b168f3f999a/533b6fcf3fa6301aadcc2b168f3f999a-000001.vmdk
            >
            >                         2019-06-03 18:38:17,658 ERROR
            > [c.c.s.r.VmwareStorageProcessor] (DirectAgent-4:ctx-08b54fbd
            > esx-0001-a-001.example.org, job-3/job-29, cmd: CopyCommand) clone volume
            > from base image failed due to Exception: java.lang.RuntimeException
            >
            >                         Message: Error caused by file
            > /vmfs/volumes/afc5e946-03bfe3c2/533b6fcf3fa6301aadcc2b168f3f999a/533b6fcf3fa6301aadcc2b168f3f999a-000001.vmdk
            >
            >
            >
            >                         If I try to create “new VM from template”
            > (533b6fcf3fa6301aadcc2b168f3f999a) on vCenter UI manually,  I will receive
            > exactly the same error message. The name of the VMDK file in the error
            > message is a snapshot of the base disk image, but it is not part of the
            > original template OVA on the secondary storage.  So, in the process of
            > copying the template from secondary to primary storage, a snapshot got
            > created and the disk became corrupted/unusable.
            >
            >                         Much later in the log file,  there is another
            > error message “failed to fetch any free public IP address” (for ssvm, I
            > think).  I don’t know if these two errors are related or if one is the root
            > cause for the other error.
            >
            >                         The full management server log is uploaded as
             > https://pastebin.com/c05wiQ3R
            >
            >                         Any help or insight on what went wrong here are
            > much appreciated.
            >
            >                         Thanks
            >
            >                         Yiping
            >
            >
            >
            >
            >
            >
            >
            >
            >
            >
            >
            >
            >
            
            -- 
            
            Andrija Panić
            
        
        
    
    


Re: Can't start systemVM in a new advanced zone deployment

Posted by Sergey Levitskiy <se...@hotmail.com>.
This must be specific to that environment. For full clone mode, ACS simply calls cloneVMTask of the vSphere API, so until cloning of that template succeeds when attempted in the vSphere client, it will keep failing in ACS. Can you post vmkernel.log from your ESX host esx-0001-a-001?
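
The disk chain named in the error can also be sanity-checked directly on the host. A diagnostic sketch, assuming the datastore path from the earlier messages; the target file name here is arbitrary:

    # check consistency of the snapshot chain the clone has to read
    vmkfstools -e /vmfs/volumes/afc5e946-03bfe3c2/533b6fcf3fa6301aadcc2b168f3f999a/533b6fcf3fa6301aadcc2b168f3f999a-000001.vmdk
    # try cloning just the disk out of the chain, bypassing the VM-level clone
    vmkfstools -i /vmfs/volumes/afc5e946-03bfe3c2/533b6fcf3fa6301aadcc2b168f3f999a/533b6fcf3fa6301aadcc2b168f3f999a-000001.vmdk /vmfs/volumes/afc5e946-03bfe3c2/chain-test.vmdk

If vmkfstools can read and copy the chain but cloneVMTask still fails, the problem is more likely in how the datastore handles the VM-level clone than in the VMDK files themselves.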


On 6/5/19, 8:47 AM, "Yiping Zhang" <yi...@adobe.com.INVALID> wrote:

    Well, I can always reproduce it in this particular vSphere setup, but in a different ACS+vSphere environment, I don't see this problem.
    
    Yiping
    
    On 6/5/19, 1:00 AM, "Andrija Panic" <an...@gmail.com> wrote:
    
        Yiping,
        
        if you are sure you can reproduce the issue, it would be good to raise a
        GitHub issue and provide as much detail as possible.
        
        Andrija
        
        On Wed, 5 Jun 2019 at 05:29, Yiping Zhang <yi...@adobe.com.invalid>
        wrote:
        
        > Hi, Sergey:
        >
        > Thanks for the tip. After setting vmware.create.full.clone=false,  I was
        > able to create and start system VM instances.    However,  I feel that the
        > underlying problem still exists, and I am just working around it instead of
        > fixing it,  because in my lab CloudStack instance with the same version of
        > ACS and vSphere,  I still have vmware.create.full.clone=true and all is
        > working as expected.
        >
        > I did some reading on VMware docs regarding full clone vs. linked clone.
        > It seems that the best practice is to use full clone for production,
        > especially if there are high rates of changes to the disks.  So
        > eventually,  I need to understand and fix the root cause for this issue.
        > At least for now,  I am over this hurdle and I can move on.
        >
        > Thanks again,
        >
        > Yiping
        >
        > On 6/4/19, 11:13 AM, "Sergey Levitskiy" <se...@hotmail.com> wrote:
        >
        >     Everything looks good and consistent including all references in VMDK
        > and its snapshot. I would try these 2 routes:
        >     1. Figure out what vSphere error actually means from vmkernel log of
        > ESX when ACS tries to clone the template. If the same error happens while
        > doing it outside of ACS then a support case with VMware can be an option
        >     2. Try using link clones. This can be done by this global setting and
        > restarting management server
        >     vmware.create.full.clone                    false
        >
        >
        >     On 6/4/19, 9:57 AM, "Yiping Zhang" <yi...@adobe.com.INVALID> wrote:
        >
        >         Hi, Sergey:
        >
         >         Thanks for the help. By now, I have dropped and recreated the DB,
         > re-deployed this zone multiple times, blown away primary and secondary
         > storage (including all contents on them), or just deleted the template
         > itself from primary storage, multiple times. Every time I ended up with
         > the same error at the same place.
        >
         >         The full management server log, from the point I seeded the
         > systemvmtemplate for vmware, to deploying a new advanced zone, enabling
         > the zone to let CS create system VMs, and finally disabling the zone to
         > stop the infinite loop of trying to recreate failed system VMs, is posted
         > at pastebin:
        >
        >
         > https://pastebin.com/c05wiQ3R
        >
        >         Here are the content of relevant files for the template on primary
        > storage:
        >
        >         1) /vmfsvolumes:
        >
        >         ls -l /vmfs/volumes/
        >         total 2052
        >         drwxr-xr-x    1 root     root             8 Jan  1  1970
        > 414f6a73-87cd6dac-9585-133ddd409762
        >         lrwxr-xr-x    1 root     root            17 Jun  4 16:37
        > 42054b8459633172be231d72a52d59d4 -> afc5e946-03bfe3c2          <== this is
        > the NFS datastore for primary storage
        >         drwxr-xr-x    1 root     root             8 Jan  1  1970
        > 5cd4b46b-fa4fcff0-d2a1-00215a9b31c0
        >         drwxr-xr-t    1 root     root          1400 Jun  3 22:50
        > 5cd4b471-c2318b91-8fb2-00215a9b31c0
        >         drwxr-xr-x    1 root     root             8 Jan  1  1970
        > 5cd4b471-da49a95b-bdb6-00215a9b31c0
        >         drwxr-xr-x    4 root     root          4096 Jun  3 23:38
        > afc5e946-03bfe3c2
        >         drwxr-xr-x    1 root     root             8 Jan  1  1970
        > b70c377c-54a9d28a-6a7b-3f462a475f73
        >
        >         2) content in template dir on primary storage:
        >
        >         ls -l
        > /vmfs/volumes/42054b8459633172be231d72a52d59d4/533b6fcf3fa6301aadcc2b168f3f999a/
        >         total 1154596
        >         -rw-------    1 root     root          8192 Jun  3 23:38
        > 533b6fcf3fa6301aadcc2b168f3f999a-000001-delta.vmdk
        >         -rw-------    1 root     root           366 Jun  3 23:38
        > 533b6fcf3fa6301aadcc2b168f3f999a-000001.vmdk
        >         -rw-r--r--    1 root     root           268 Jun  3 23:38
        > 533b6fcf3fa6301aadcc2b168f3f999a-7d5d73de.hlog
        >         -rw-------    1 root     root          9711 Jun  3 23:38
        > 533b6fcf3fa6301aadcc2b168f3f999a-Snapshot1.vmsn
        >         -rw-------    1 root     root     2097152000 Jun  3 23:38
        > 533b6fcf3fa6301aadcc2b168f3f999a-flat.vmdk
        >         -rw-------    1 root     root           518 Jun  3 23:38
        > 533b6fcf3fa6301aadcc2b168f3f999a.vmdk
        >         -rw-r--r--    1 root     root           471 Jun  3 23:38
        > 533b6fcf3fa6301aadcc2b168f3f999a.vmsd
        >         -rwxr-xr-x    1 root     root          1402 Jun  3 23:38
        > 533b6fcf3fa6301aadcc2b168f3f999a.vmtx
        >
        >         3) *.vmdk file content:
        >
        >         cat
        > /vmfs/volumes/42054b8459633172be231d72a52d59d4/533b6fcf3fa6301aadcc2b168f3f999a/533b6fcf3fa6301aadcc2b168f3f999a.vmdk
        >         # Disk DescriptorFile
        >         version=1
        >         encoding="UTF-8"
        >         CID=ecb01275
        >         parentCID=ffffffff
        >         isNativeSnapshot="no"
        >         createType="vmfs"
        >
        >         # Extent description
        >         RW 4096000 VMFS "533b6fcf3fa6301aadcc2b168f3f999a-flat.vmdk"
        >
        >         # The Disk Data Base
        >         #DDB
        >
        >         ddb.adapterType = "lsilogic"
        >         ddb.geometry.cylinders = "4063"
        >         ddb.geometry.heads = "16"
        >         ddb.geometry.sectors = "63"
        >         ddb.longContentID = "1c60ba48999abde959998f05ecb01275"
        >         ddb.thinProvisioned = "1"
        >         ddb.uuid = "60 00 C2 9b 52 6d 98 c4-1f 44 51 ce 1e 70 a9 70"
        >         ddb.virtualHWVersion = "13"
        >
        >         4) *-0001.vmdk content:
        >
        >         cat
        > /vmfs/volumes/42054b8459633172be231d72a52d59d4/533b6fcf3fa6301aadcc2b168f3f999a/533b6fcf3fa6301aadcc2b168f3f999a-000001.vmdk
        >
        >         # Disk DescriptorFile
        >         version=1
        >         encoding="UTF-8"
        >         CID=ecb01275
        >         parentCID=ecb01275
        >         isNativeSnapshot="no"
        >         createType="vmfsSparse"
        >         parentFileNameHint="533b6fcf3fa6301aadcc2b168f3f999a.vmdk"
        >         # Extent description
        >         RW 4096000 VMFSSPARSE
        > "533b6fcf3fa6301aadcc2b168f3f999a-000001-delta.vmdk"
        >
        >         # The Disk Data Base
        >         #DDB
        >
        >         ddb.longContentID = "1c60ba48999abde959998f05ecb01275"
        >
        >
        >         5) *.vmtx content:
        >
        >         cat
        > /vmfs/volumes/42054b8459633172be231d72a52d59d4/533b6fcf3fa6301aadcc2b168f3f999a/533b6fcf3fa6301aadcc2b168f3f999a.vmtx
        >
        >         .encoding = "UTF-8"
        >         config.version = "8"
        >         virtualHW.version = "8"
        >         nvram = "533b6fcf3fa6301aadcc2b168f3f999a.nvram"
        >         pciBridge0.present = "TRUE"
        >         svga.present = "TRUE"
        >         pciBridge4.present = "TRUE"
        >         pciBridge4.virtualDev = "pcieRootPort"
        >         pciBridge4.functions = "8"
        >         pciBridge5.present = "TRUE"
        >         pciBridge5.virtualDev = "pcieRootPort"
        >         pciBridge5.functions = "8"
        >         pciBridge6.present = "TRUE"
        >         pciBridge6.virtualDev = "pcieRootPort"
        >         pciBridge6.functions = "8"
        >         pciBridge7.present = "TRUE"
        >         pciBridge7.virtualDev = "pcieRootPort"
        >         pciBridge7.functions = "8"
        >         vmci0.present = "TRUE"
        >         hpet0.present = "TRUE"
        >         floppy0.present = "FALSE"
        >         memSize = "256"
        >         scsi0.virtualDev = "lsilogic"
        >         scsi0.present = "TRUE"
        >         ide0:0.startConnected = "FALSE"
        >         ide0:0.deviceType = "atapi-cdrom"
        >         ide0:0.fileName = "CD/DVD drive 0"
        >         ide0:0.present = "TRUE"
        >         scsi0:0.deviceType = "scsi-hardDisk"
        >         scsi0:0.fileName = "533b6fcf3fa6301aadcc2b168f3f999a-000001.vmdk"
        >         scsi0:0.present = "TRUE"
        >         displayName = "533b6fcf3fa6301aadcc2b168f3f999a"
        >         annotation = "systemvmtemplate-4.11.2.0-vmware"
        >         guestOS = "otherlinux-64"
        >         toolScripts.afterPowerOn = "TRUE"
        >         toolScripts.afterResume = "TRUE"
        >         toolScripts.beforeSuspend = "TRUE"
        >         toolScripts.beforePowerOff = "TRUE"
        >         uuid.bios = "42 02 f1 40 33 e8 de e5-1a c5 93 2a c9 12 47 61"
        >         vc.uuid = "50 02 5b d9 e9 c9 77 86-28 3e 84 00 22 2b eb d3"
        >         firmware = "bios"
        >         migrate.hostLog = "533b6fcf3fa6301aadcc2b168f3f999a-7d5d73de.hlog"
        >
        >
        >         6) *.vmsd file content:
        >
        >         cat
        > /vmfs/volumes/42054b8459633172be231d72a52d59d4/533b6fcf3fa6301aadcc2b168f3f999a/533b6fcf3fa6301aadcc2b168f3f999a.vmsd
        >         .encoding = "UTF-8"
        >         snapshot.lastUID = "1"
        >         snapshot.current = "1"
        >         snapshot0.uid = "1"
        >         snapshot0.filename =
        > "533b6fcf3fa6301aadcc2b168f3f999a-Snapshot1.vmsn"
        >         snapshot0.displayName = "cloud.template.base"
        >         snapshot0.description = "Base snapshot"
        >         snapshot0.createTimeHigh = "363123"
        >         snapshot0.createTimeLow = "-679076964"
        >         snapshot0.numDisks = "1"
        >         snapshot0.disk0.fileName = "533b6fcf3fa6301aadcc2b168f3f999a.vmdk"
        >         snapshot0.disk0.node = "scsi0:0"
        >         snapshot.numSnapshots = "1"
        >
        >         7) *-Snapshot1.vmsn content:
        >
        >         cat
        > /vmfs/volumes/42054b8459633172be231d72a52d59d4/533b6fcf3fa6301aadcc2b168f3f999a/533b6fcf3fa6301aadcc2b168f3f999a-Snapshot1.vmsn
        >
        >         ҾSnapshot\?%?cfgFilet%t%.encoding = "UTF-8"
        >         config.version = "8"
        >         virtualHW.version = "8"
        >         nvram = "533b6fcf3fa6301aadcc2b168f3f999a.nvram"
        >         pciBridge0.present = "TRUE"
        >         svga.present = "TRUE"
        >         pciBridge4.present = "TRUE"
        >         pciBridge4.virtualDev = "pcieRootPort"
        >         pciBridge4.functions = "8"
        >         pciBridge5.present = "TRUE"
        >         pciBridge5.virtualDev = "pcieRootPort"
        >         pciBridge5.functions = "8"
        >         pciBridge6.present = "TRUE"
        >         pciBridge6.virtualDev = "pcieRootPort"
        >         pciBridge6.functions = "8"
        >         pciBridge7.present = "TRUE"
        >         pciBridge7.virtualDev = "pcieRootPort"
        >         pciBridge7.functions = "8"
        >         vmci0.present = "TRUE"
        >         hpet0.present = "TRUE"
        >         floppy0.present = "FALSE"
        >         memSize = "256"
        >         scsi0.virtualDev = "lsilogic"
        >         scsi0.present = "TRUE"
        >         ide0:0.startConnected = "FALSE"
        >         ide0:0.deviceType = "atapi-cdrom"
        >         ide0:0.fileName = "CD/DVD drive 0"
        >         ide0:0.present = "TRUE"
        >         scsi0:0.deviceType = "scsi-hardDisk"
        >         scsi0:0.fileName = "533b6fcf3fa6301aadcc2b168f3f999a.vmdk"
        >         scsi0:0.present = "TRUE"
        >         displayName = "533b6fcf3fa6301aadcc2b168f3f999a"
        >         annotation = "systemvmtemplate-4.11.2.0-vmware"
        >         guestOS = "otherlinux-64"
        >         toolScripts.afterPowerOn = "TRUE"
        >         toolScripts.afterResume = "TRUE"
        >         toolScripts.beforeSuspend = "TRUE"
        >         toolScripts.beforePowerOff = "TRUE"
        >         uuid.bios = "42 02 f1 40 33 e8 de e5-1a c5 93 2a c9 12 47 61"
        >         vc.uuid = "50 02 5b d9 e9 c9 77 86-28 3e 84 00 22 2b eb d3"
        >         firmware = "bios"
        >         migrate.hostLog = "533b6fcf3fa6301aadcc2b168f3f999a-7d5d73de.hlog"
        >
        >
        >         ------------
        >
        >         That's all the data on the template VMDK.
        >
        >         Much appreciate your time!
        >
        >         Yiping
        >
        >
        >
        >         On 6/4/19, 9:29 AM, "Sergey Levitskiy" <se...@hotmail.com>
        > wrote:
        >
         >             Have you tried deleting the template from PS and letting ACS
         > recopy it again? If the issue is reproducible, we can try to look at what
         > is wrong with the VMDK. Please post the content of
         > 533b6fcf3fa6301aadcc2b168f3f999a.vmdk,
         > 533b6fcf3fa6301aadcc2b168f3f999a-000001.vmdk and
         > 533b6fcf3fa6301aadcc2b168f3f999a.vmx (their equivalent after ACS finishes
         > copying the template). Also, from one of your ESX hosts, the output of:
         >             ls -al /vmfs/volumes
         >             ls -al /vmfs/volumes/*/533b6fcf3fa6301aadcc2b168f3f999a (their
         > equivalent after ACS finishes copying the template)
        >
         >              Can you also post the management server log starting from the
         > point you unregister and delete the template from vCenter.
        >
        >             On 6/4/19, 8:37 AM, "Yiping Zhang" <yi...@adobe.com.INVALID>
        > wrote:
        >
        >                 I have manually imported the OVA to vCenter and
        > successfully cloned a VM instance with it, on the same NFS datastore.
        >
        >
        >                 On 6/4/19, 8:25 AM, "Sergey Levitskiy" <
        > serg38l@hotmail.com> wrote:
        >
        >                     I would suspect the template is corrupted on the
        > secondary storage. You can try disabling/enabling link clone feature and
        > see if it works the other way.
        >                     vmware.create.full.clone                    false
        >
        >                     Also systemVM template might have been generated on a
        > newer version of vSphere and not compatible with ESXi 6.5. What you can do
        > to validate this is to manually deploy OVA that is in Secondary storage and
        > try to spin up VM from it directly in vCenter.
        >
        >
        >
        
        -- 
        
        Andrija Panić
        
    
    


Re: Can't start systemVM in a new advanced zone deployment

Posted by Yiping Zhang <yi...@adobe.com.INVALID>.
Well, I can always reproduce it in this particular vSphere setup, but in a different ACS+vSphere environment I don't see this problem.

Yiping

On 6/5/19, 1:00 AM, "Andrija Panic" <an...@gmail.com> wrote:

    Yiping,
    
    if you are sure you can reproduce the issue, it would be good to raise a
    GitHub issue and provide as much detail as possible.
    
    Andrija
    
    
    -- 
    
    Andrija Panić
    


Re: Can't start systemVM in a new advanced zone deployment

Posted by Andrija Panic <an...@gmail.com>.
Yiping,

if you are sure you can reproduce the issue, it would be good to raise a
GitHub issue and provide as much detail as possible.

Andrija

On Wed, 5 Jun 2019 at 05:29, Yiping Zhang <yi...@adobe.com.invalid>
wrote:

> Hi, Sergey:
>
> Thanks for the tip. After setting vmware.create.full.clone=false,  I was
> able to create and start system VM instances.    However,  I feel that the
> underlying problem still exists, and I am just working around it instead of
> fixing it,  because in my lab CloudStack instance with the same version of
> ACS and vSphere,  I still have vmware.create.full.clone=true and all is
> working as expected.
>
> I did some reading on VMware docs regarding full clone vs. linked clone.
> It seems that the best practice is to use full clone for production,
> especially if there are high rates of changes to the disks.  So
> eventually,  I need to understand and fix the root cause for this issue.
> At least for now,  I am over this hurdle and I can move on.
>
> Thanks again,
>
> Yiping
>

-- 

Andrija Panić

Re: Can't start systemVM in a new advanced zone deployment

Posted by Yiping Zhang <yi...@adobe.com.INVALID>.
Hi, Sergey:

Thanks for the tip. After setting vmware.create.full.clone=false, I was able to create and start system VM instances. However, I feel the underlying problem still exists and I am only working around it rather than fixing it, because my lab CloudStack instance, running the same versions of ACS and vSphere, still has vmware.create.full.clone=true and everything works as expected.

I did some reading in the VMware docs on full clones vs. linked clones. The best practice seems to be full clones for production, especially when the disks see a high rate of change. So eventually I need to understand and fix the root cause of this issue; at least for now I am over this hurdle and can move on.
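(For anyone else hitting this: the toggle itself does not need the UI. A minimal sketch, assuming the cloudmonkey CLI is configured against the management server and the default "cloud" database name; the service name may differ per distro.)

    # Flip the global setting via the API, then restart the management server
    cloudmonkey update configuration name=vmware.create.full.clone value=false
    systemctl restart cloudstack-management

    # Equivalent change made directly in the database
    mysql -u cloud -p cloud -e \
        "UPDATE configuration SET value='false' WHERE name='vmware.create.full.clone';"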

Thanks again,

Yiping

On 6/4/19, 11:13 AM, "Sergey Levitskiy" <se...@hotmail.com> wrote:

    Everything looks good and consistent including all references in VMDK  and its snapshot. I would try these 2 routes:
    1. Figure out what vSphere error actually means from vmkernel log of ESX when ACS tries to clone the template. If the same error happens while doing it outside of ACS then a support case with VMware can be an option
    2. Try using link clones. This can be done by this global setting and restarting management server
    vmware.create.full.clone 			false
    
    
    


Re: Can't start systemVM in a new advanced zone deployment

Posted by Sergey Levitskiy <se...@hotmail.com>.
Everything looks good and consistent, including all references in the VMDK and its snapshot. I would try these two routes:
1. Figure out what the vSphere error actually means from the vmkernel log of the ESX host when ACS tries to clone the template. If the same error happens when doing it outside of ACS, then a support case with VMware can be an option.
2. Try using linked clones. This can be done with this global setting and a restart of the management server:
vmware.create.full.clone 			false
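For route 1, a rough sketch of where to look on the host itself, using the datastore and template names from the log above (clone errors sometimes land in hostd.log or vpxa.log rather than vmkernel.log, so check all of them):

    # On the ESXi host that executed the CopyCommand
    grep -i 533b6fcf3fa6301aadcc2b168f3f999a /var/log/vmkernel.log /var/log/hostd.log /var/log/vpxa.log

    # Sanity-check the snapshot chain of the staged template
    cd /vmfs/volumes/afc5e946-03bfe3c2/533b6fcf3fa6301aadcc2b168f3f999a
    vmkfstools -e 533b6fcf3fa6301aadcc2b168f3f999a-000001.vmdk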


On 6/4/19, 9:57 AM, "Yiping Zhang" <yi...@adobe.com.INVALID> wrote:

    Hi, Sergey:
    
    Thanks for the help. By now I have dropped and recreated the DB, re-deployed this zone multiple times, blown away primary and secondary storage (including all contents on them), and deleted just the template itself from primary storage, multiple times. Every time I ended up with the same error at the same place.
    
    The full management server log, from the point I seeded the systemvmtemplate for VMware, through deploying a new advanced zone and enabling it so CS would create the system VMs, to finally disabling the zone to stop the infinite loop of recreating the failed system VMs, is posted at pastebin:
    
    https://pastebin.com/c05wiQ3R
    
    Here are the contents of the relevant files for the template on primary storage:
    
    1) /vmfs/volumes:
    
    ls -l /vmfs/volumes/
    total 2052
    drwxr-xr-x    1 root     root             8 Jan  1  1970 414f6a73-87cd6dac-9585-133ddd409762
    lrwxr-xr-x    1 root     root            17 Jun  4 16:37 42054b8459633172be231d72a52d59d4 -> afc5e946-03bfe3c2          <== this is the NFS datastore for primary storage
    drwxr-xr-x    1 root     root             8 Jan  1  1970 5cd4b46b-fa4fcff0-d2a1-00215a9b31c0
    drwxr-xr-t    1 root     root          1400 Jun  3 22:50 5cd4b471-c2318b91-8fb2-00215a9b31c0
    drwxr-xr-x    1 root     root             8 Jan  1  1970 5cd4b471-da49a95b-bdb6-00215a9b31c0
    drwxr-xr-x    4 root     root          4096 Jun  3 23:38 afc5e946-03bfe3c2
    drwxr-xr-x    1 root     root             8 Jan  1  1970 b70c377c-54a9d28a-6a7b-3f462a475f73
    
    2) content in template dir on primary storage:
    
    ls -l /vmfs/volumes/42054b8459633172be231d72a52d59d4/533b6fcf3fa6301aadcc2b168f3f999a/
    total 1154596
    -rw-------    1 root     root          8192 Jun  3 23:38 533b6fcf3fa6301aadcc2b168f3f999a-000001-delta.vmdk
    -rw-------    1 root     root           366 Jun  3 23:38 533b6fcf3fa6301aadcc2b168f3f999a-000001.vmdk
    -rw-r--r--    1 root     root           268 Jun  3 23:38 533b6fcf3fa6301aadcc2b168f3f999a-7d5d73de.hlog
    -rw-------    1 root     root          9711 Jun  3 23:38 533b6fcf3fa6301aadcc2b168f3f999a-Snapshot1.vmsn
    -rw-------    1 root     root     2097152000 Jun  3 23:38 533b6fcf3fa6301aadcc2b168f3f999a-flat.vmdk
    -rw-------    1 root     root           518 Jun  3 23:38 533b6fcf3fa6301aadcc2b168f3f999a.vmdk
    -rw-r--r--    1 root     root           471 Jun  3 23:38 533b6fcf3fa6301aadcc2b168f3f999a.vmsd
    -rwxr-xr-x    1 root     root          1402 Jun  3 23:38 533b6fcf3fa6301aadcc2b168f3f999a.vmtx
    
    3) *.vmdk file content:
    
    cat /vmfs/volumes/42054b8459633172be231d72a52d59d4/533b6fcf3fa6301aadcc2b168f3f999a/533b6fcf3fa6301aadcc2b168f3f999a.vmdk
    # Disk DescriptorFile
    version=1
    encoding="UTF-8"
    CID=ecb01275
    parentCID=ffffffff
    isNativeSnapshot="no"
    createType="vmfs"
    
    # Extent description
    RW 4096000 VMFS "533b6fcf3fa6301aadcc2b168f3f999a-flat.vmdk"
    
    # The Disk Data Base 
    #DDB
    
    ddb.adapterType = "lsilogic"
    ddb.geometry.cylinders = "4063"
    ddb.geometry.heads = "16"
    ddb.geometry.sectors = "63"
    ddb.longContentID = "1c60ba48999abde959998f05ecb01275"
    ddb.thinProvisioned = "1"
    ddb.uuid = "60 00 C2 9b 52 6d 98 c4-1f 44 51 ce 1e 70 a9 70"
    ddb.virtualHWVersion = "13"
    
    4) *-0001.vmdk content:
    
    cat /vmfs/volumes/42054b8459633172be231d72a52d59d4/533b6fcf3fa6301aadcc2b168f3f999a/533b6fcf3fa6301aadcc2b168f3f999a-000001.vmdk 
    # Disk DescriptorFile
    version=1
    encoding="UTF-8"
    CID=ecb01275
    parentCID=ecb01275
    isNativeSnapshot="no"
    createType="vmfsSparse"
    parentFileNameHint="533b6fcf3fa6301aadcc2b168f3f999a.vmdk"
    # Extent description
    RW 4096000 VMFSSPARSE "533b6fcf3fa6301aadcc2b168f3f999a-000001-delta.vmdk"
    
    # The Disk Data Base 
    #DDB
    
    ddb.longContentID = "1c60ba48999abde959998f05ecb01275"
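    
    The delta descriptor's parentCID (ecb01275) matches the base descriptor's CID, so the chain looks consistent on paper. As an additional check (untested here, and assuming I remember the vmkfstools option correctly), the ESXi host can verify the snapshot chain itself:
    
    # run on the ESXi host over SSH; -e walks the descriptor chain and reports whether it is consistent
    vmkfstools -e /vmfs/volumes/42054b8459633172be231d72a52d59d4/533b6fcf3fa6301aadcc2b168f3f999a/533b6fcf3fa6301aadcc2b168f3f999a-000001.vmdk
    
    If that reports a problem, the copied files themselves are bad; if the chain is reported as consistent, the issue is more likely in how the clone is being driven.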
    
    
    5) *.vmtx content:
    
    cat /vmfs/volumes/42054b8459633172be231d72a52d59d4/533b6fcf3fa6301aadcc2b168f3f999a/533b6fcf3fa6301aadcc2b168f3f999a.vmtx 
    .encoding = "UTF-8"
    config.version = "8"
    virtualHW.version = "8"
    nvram = "533b6fcf3fa6301aadcc2b168f3f999a.nvram"
    pciBridge0.present = "TRUE"
    svga.present = "TRUE"
    pciBridge4.present = "TRUE"
    pciBridge4.virtualDev = "pcieRootPort"
    pciBridge4.functions = "8"
    pciBridge5.present = "TRUE"
    pciBridge5.virtualDev = "pcieRootPort"
    pciBridge5.functions = "8"
    pciBridge6.present = "TRUE"
    pciBridge6.virtualDev = "pcieRootPort"
    pciBridge6.functions = "8"
    pciBridge7.present = "TRUE"
    pciBridge7.virtualDev = "pcieRootPort"
    pciBridge7.functions = "8"
    vmci0.present = "TRUE"
    hpet0.present = "TRUE"
    floppy0.present = "FALSE"
    memSize = "256"
    scsi0.virtualDev = "lsilogic"
    scsi0.present = "TRUE"
    ide0:0.startConnected = "FALSE"
    ide0:0.deviceType = "atapi-cdrom"
    ide0:0.fileName = "CD/DVD drive 0"
    ide0:0.present = "TRUE"
    scsi0:0.deviceType = "scsi-hardDisk"
    scsi0:0.fileName = "533b6fcf3fa6301aadcc2b168f3f999a-000001.vmdk"
    scsi0:0.present = "TRUE"
    displayName = "533b6fcf3fa6301aadcc2b168f3f999a"
    annotation = "systemvmtemplate-4.11.2.0-vmware"
    guestOS = "otherlinux-64"
    toolScripts.afterPowerOn = "TRUE"
    toolScripts.afterResume = "TRUE"
    toolScripts.beforeSuspend = "TRUE"
    toolScripts.beforePowerOff = "TRUE"
    uuid.bios = "42 02 f1 40 33 e8 de e5-1a c5 93 2a c9 12 47 61"
    vc.uuid = "50 02 5b d9 e9 c9 77 86-28 3e 84 00 22 2b eb d3"
    firmware = "bios"
    migrate.hostLog = "533b6fcf3fa6301aadcc2b168f3f999a-7d5d73de.hlog"
    
    
    6) *.vmsd file content:
    
    cat /vmfs/volumes/42054b8459633172be231d72a52d59d4/533b6fcf3fa6301aadcc2b168f3f999a/533b6fcf3fa6301aadcc2b168f3f999a.vmsd
    .encoding = "UTF-8"
    snapshot.lastUID = "1"
    snapshot.current = "1"
    snapshot0.uid = "1"
    snapshot0.filename = "533b6fcf3fa6301aadcc2b168f3f999a-Snapshot1.vmsn"
    snapshot0.displayName = "cloud.template.base"
    snapshot0.description = "Base snapshot"
    snapshot0.createTimeHigh = "363123"
    snapshot0.createTimeLow = "-679076964"
    snapshot0.numDisks = "1"
    snapshot0.disk0.fileName = "533b6fcf3fa6301aadcc2b168f3f999a.vmdk"
    snapshot0.disk0.node = "scsi0:0"
    snapshot.numSnapshots = "1"
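    
    The .vmsd above confirms a single snapshot named "cloud.template.base", which is presumably the snapshot the linked clones would be created from. For reference, a quick way to view the same thing from vCenter's side is govc (just a sketch; it assumes govc is installed and GOVC_URL/GOVC_USERNAME/GOVC_PASSWORD point at this vCenter):
    
    # list the snapshot tree of the registered template VM; expect to see cloud.template.base
    govc snapshot.tree -vm 533b6fcf3fa6301aadcc2b168f3f999a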
    
    7) *-Snapshot1.vmsn content:
    
    cat /vmfs/volumes/42054b8459633172be231d72a52d59d4/533b6fcf3fa6301aadcc2b168f3f999a/533b6fcf3fa6301aadcc2b168f3f999a-Snapshot1.vmsn 
    ҾSnapshot\?%?cfgFilet%t%.encoding = "UTF-8"
    config.version = "8"
    virtualHW.version = "8"
    nvram = "533b6fcf3fa6301aadcc2b168f3f999a.nvram"
    pciBridge0.present = "TRUE"
    svga.present = "TRUE"
    pciBridge4.present = "TRUE"
    pciBridge4.virtualDev = "pcieRootPort"
    pciBridge4.functions = "8"
    pciBridge5.present = "TRUE"
    pciBridge5.virtualDev = "pcieRootPort"
    pciBridge5.functions = "8"
    pciBridge6.present = "TRUE"
    pciBridge6.virtualDev = "pcieRootPort"
    pciBridge6.functions = "8"
    pciBridge7.present = "TRUE"
    pciBridge7.virtualDev = "pcieRootPort"
    pciBridge7.functions = "8"
    vmci0.present = "TRUE"
    hpet0.present = "TRUE"
    floppy0.present = "FALSE"
    memSize = "256"
    scsi0.virtualDev = "lsilogic"
    scsi0.present = "TRUE"
    ide0:0.startConnected = "FALSE"
    ide0:0.deviceType = "atapi-cdrom"
    ide0:0.fileName = "CD/DVD drive 0"
    ide0:0.present = "TRUE"
    scsi0:0.deviceType = "scsi-hardDisk"
    scsi0:0.fileName = "533b6fcf3fa6301aadcc2b168f3f999a.vmdk"
    scsi0:0.present = "TRUE"
    displayName = "533b6fcf3fa6301aadcc2b168f3f999a"
    annotation = "systemvmtemplate-4.11.2.0-vmware"
    guestOS = "otherlinux-64"
    toolScripts.afterPowerOn = "TRUE"
    toolScripts.afterResume = "TRUE"
    toolScripts.beforeSuspend = "TRUE"
    toolScripts.beforePowerOff = "TRUE"
    uuid.bios = "42 02 f1 40 33 e8 de e5-1a c5 93 2a c9 12 47 61"
    vc.uuid = "50 02 5b d9 e9 c9 77 86-28 3e 84 00 22 2b eb d3"
    firmware = "bios"
    migrate.hostLog = "533b6fcf3fa6301aadcc2b168f3f999a-7d5d73de.hlog"
    
    
    ------------
    
    That's all the data on the template VMDK.
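    
    One more experiment that might narrow this down (a sketch only, assuming vmkfstools -i accepts the delta descriptor as a source): try cloning the disk chain by hand on the ESXi host and see whether it fails with the same error as the CopyCommand. The destination name below is just a throwaway:
    
    # consolidate the base + delta into a temporary flat disk; failure here would point at the chain, success at the clone call
    vmkfstools -i /vmfs/volumes/42054b8459633172be231d72a52d59d4/533b6fcf3fa6301aadcc2b168f3f999a/533b6fcf3fa6301aadcc2b168f3f999a-000001.vmdk /vmfs/volumes/42054b8459633172be231d72a52d59d4/clone-test.vmdk -d thin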
    
    I much appreciate your time!
    
    Yiping
    
    
    
    On 6/4/19, 9:29 AM, "Sergey Levitskiy" <se...@hotmail.com> wrote:
    
        Have you tried deleting the template from PS and letting ACS recopy it? If the issue is reproducible, we can try to work out what is wrong with the VMDK. Please post the content of 533b6fcf3fa6301aadcc2b168f3f999a.vmdk, 533b6fcf3fa6301aadcc2b168f3f999a-000001.vmdk and 533b6fcf3fa6301aadcc2b168f3f999a.vmx (or their equivalent after ACS finishes copying the template). Also, from one of your ESX hosts, the output of:
        ls -al /vmfs/volumes
        ls -al /vmfs/volumes/*/533b6fcf3fa6301aadcc2b168f3f999a (or their equivalent after ACS finishes copying the template)
        
        Can you also post the management server log starting from the point where you unregister and delete the template from vCenter?
        
        On 6/4/19, 8:37 AM, "Yiping Zhang" <yi...@adobe.com.INVALID> wrote:
        
            I have manually imported the OVA to vCenter and successfully cloned a VM instance with it, on the same NFS datastore.
            
            
            On 6/4/19, 8:25 AM, "Sergey Levitskiy" <se...@hotmail.com> wrote:
            
                I would suspect the template is corrupted on the secondary storage. You can try disabling/enabling the linked clone feature and see if it works the other way.
                vmware.create.full.clone    false
                
                Also, the systemVM template might have been generated on a newer version of vSphere and not be compatible with ESXi 6.5. To validate this, you can manually deploy the OVA that is in secondary storage and try to spin up a VM from it directly in vCenter.
                
                
                
                On 6/3/19, 5:41 PM, "Yiping Zhang" <yi...@adobe.com.INVALID> wrote:
                
                    Hi, list:
                    
                    I am struggling with deploying a new advanced zone using ACS 4.11.2.0 + vSphere 6.5 + NetApp volumes for primary and secondary storage devices. The initial setup of CS management server, seeding of systemVM template, and advanced zone deployment all went smoothly.
                    
                    Once I enabled the zone in web UI and the systemVM template gets copied/staged on to primary storage device. But subsequent VM creations from this template would fail with errors:
                    
                    
                    2019-06-03 18:38:15,764 INFO  [c.c.h.v.m.HostMO] (DirectAgent-7:ctx-d01169cb esx-0001-a-001.example.org, job-3/job-29, cmd: CopyCommand) VM 533b6fcf3fa6301aadcc2b168f3f999a not found in host cache
                    
                    2019-06-03 18:38:17,017 INFO  [c.c.h.v.r.VmwareResource] (DirectAgent-4:ctx-08b54fbd esx-0001-a-001.example.org, job-3/job-29, cmd: CopyCommand) VmwareStorageProcessor and VmwareStorageSubsystemCommandHandler successfully reconfigured
                    
                    2019-06-03 18:38:17,128 INFO  [c.c.s.r.VmwareStorageProcessor] (DirectAgent-4:ctx-08b54fbd esx-0001-a-001.example.org, job-3/job-29, cmd: CopyCommand) creating full clone from template
                    
                    2019-06-03 18:38:17,657 INFO  [c.c.h.v.u.VmwareHelper] (DirectAgent-4:ctx-08b54fbd esx-0001-a-001.example.org, job-3/job-29, cmd: CopyCommand) [ignored]failed toi get message for exception: Error caused by file /vmfs/volumes/afc5e946-03bfe3c2/533b6fcf3fa6301aadcc2b168f3f999a/533b6fcf3fa6301aadcc2b168f3f999a-000001.vmdk
                    
                    2019-06-03 18:38:17,658 ERROR [c.c.s.r.VmwareStorageProcessor] (DirectAgent-4:ctx-08b54fbd esx-0001-a-001.example.org, job-3/job-29, cmd: CopyCommand) clone volume from base image failed due to Exception: java.lang.RuntimeException
                    
                    Message: Error caused by file /vmfs/volumes/afc5e946-03bfe3c2/533b6fcf3fa6301aadcc2b168f3f999a/533b6fcf3fa6301aadcc2b168f3f999a-000001.vmdk
                    
                    
                    
                    If I try to create “new VM from template” (533b6fcf3fa6301aadcc2b168f3f999a) on vCenter UI manually,  I will receive exactly the same error message. The name of the VMDK file in the error message is a snapshot of the base disk image, but it is not part of the original template OVA on the secondary storage.  So, in the process of copying the template from secondary to primary storage, a snapshot got created and the disk became corrupted/unusable.
                    
                    Much later in the log file,  there is another error message “failed to fetch any free public IP address” (for ssvm, I think).  I don’t know if these two errors are related or if one is the root cause for the other error.
                    
                    The full management server log is uploaded as https://pastebin.com/c05wiQ3R
                    
                    Any help or insight on what went wrong here are much appreciated.
                    
                    Thanks
                    
                    Yiping
                    
                
                
            
            
        
        
    
    


Re: Can't start systemVM in a new advanced zone deployment

Posted by Yiping Zhang <yi...@adobe.com.INVALID>.
Hi, Sergey:

Thanks for the help. By now I have dropped and recreated the DB, re-deployed this zone, blown away the primary and secondary storage (including all their contents), and deleted just the template from primary storage, multiple times. Every time I ended up with the same error in the same place.

The full management server log, from the point where I seeded the systemvm template for VMware, through deploying a new advanced zone, enabling the zone to let CS create the system VMs, and finally disabling the zone to stop the infinite loop of recreating the failed system VMs, is posted at pastebin:

https://pastebin.com/c05wiQ3R

Here is the content of the relevant files for the template on primary storage:

1) /vmfs/volumes:

ls -l /vmfs/volumes/
total 2052
drwxr-xr-x    1 root     root             8 Jan  1  1970 414f6a73-87cd6dac-9585-133ddd409762
lrwxr-xr-x    1 root     root            17 Jun  4 16:37 42054b8459633172be231d72a52d59d4 -> afc5e946-03bfe3c2          <== this is the NFS datastore for primary storage
drwxr-xr-x    1 root     root             8 Jan  1  1970 5cd4b46b-fa4fcff0-d2a1-00215a9b31c0
drwxr-xr-t    1 root     root          1400 Jun  3 22:50 5cd4b471-c2318b91-8fb2-00215a9b31c0
drwxr-xr-x    1 root     root             8 Jan  1  1970 5cd4b471-da49a95b-bdb6-00215a9b31c0
drwxr-xr-x    4 root     root          4096 Jun  3 23:38 afc5e946-03bfe3c2
drwxr-xr-x    1 root     root             8 Jan  1  1970 b70c377c-54a9d28a-6a7b-3f462a475f73

2) content in template dir on primary storage:

ls -l /vmfs/volumes/42054b8459633172be231d72a52d59d4/533b6fcf3fa6301aadcc2b168f3f999a/
total 1154596
-rw-------    1 root     root          8192 Jun  3 23:38 533b6fcf3fa6301aadcc2b168f3f999a-000001-delta.vmdk
-rw-------    1 root     root           366 Jun  3 23:38 533b6fcf3fa6301aadcc2b168f3f999a-000001.vmdk
-rw-r--r--    1 root     root           268 Jun  3 23:38 533b6fcf3fa6301aadcc2b168f3f999a-7d5d73de.hlog
-rw-------    1 root     root          9711 Jun  3 23:38 533b6fcf3fa6301aadcc2b168f3f999a-Snapshot1.vmsn
-rw-------    1 root     root     2097152000 Jun  3 23:38 533b6fcf3fa6301aadcc2b168f3f999a-flat.vmdk
-rw-------    1 root     root           518 Jun  3 23:38 533b6fcf3fa6301aadcc2b168f3f999a.vmdk
-rw-r--r--    1 root     root           471 Jun  3 23:38 533b6fcf3fa6301aadcc2b168f3f999a.vmsd
-rwxr-xr-x    1 root     root          1402 Jun  3 23:38 533b6fcf3fa6301aadcc2b168f3f999a.vmtx

3) *.vmdk file content:

cat /vmfs/volumes/42054b8459633172be231d72a52d59d4/533b6fcf3fa6301aadcc2b168f3f999a/533b6fcf3fa6301aadcc2b168f3f999a.vmdk
# Disk DescriptorFile
version=1
encoding="UTF-8"
CID=ecb01275
parentCID=ffffffff
isNativeSnapshot="no"
createType="vmfs"

# Extent description
RW 4096000 VMFS "533b6fcf3fa6301aadcc2b168f3f999a-flat.vmdk"

# The Disk Data Base 
#DDB

ddb.adapterType = "lsilogic"
ddb.geometry.cylinders = "4063"
ddb.geometry.heads = "16"
ddb.geometry.sectors = "63"
ddb.longContentID = "1c60ba48999abde959998f05ecb01275"
ddb.thinProvisioned = "1"
ddb.uuid = "60 00 C2 9b 52 6d 98 c4-1f 44 51 ce 1e 70 a9 70"
ddb.virtualHWVersion = "13"

4) *-000001.vmdk content:

cat /vmfs/volumes/42054b8459633172be231d72a52d59d4/533b6fcf3fa6301aadcc2b168f3f999a/533b6fcf3fa6301aadcc2b168f3f999a-000001.vmdk 
# Disk DescriptorFile
version=1
encoding="UTF-8"
CID=ecb01275
parentCID=ecb01275
isNativeSnapshot="no"
createType="vmfsSparse"
parentFileNameHint="533b6fcf3fa6301aadcc2b168f3f999a.vmdk"
# Extent description
RW 4096000 VMFSSPARSE "533b6fcf3fa6301aadcc2b168f3f999a-000001-delta.vmdk"

# The Disk Data Base 
#DDB

ddb.longContentID = "1c60ba48999abde959998f05ecb01275"


5) *.vmtx content:

cat /vmfs/volumes/42054b8459633172be231d72a52d59d4/533b6fcf3fa6301aadcc2b168f3f999a/533b6fcf3fa6301aadcc2b168f3f999a.vmtx 
.encoding = "UTF-8"
config.version = "8"
virtualHW.version = "8"
nvram = "533b6fcf3fa6301aadcc2b168f3f999a.nvram"
pciBridge0.present = "TRUE"
svga.present = "TRUE"
pciBridge4.present = "TRUE"
pciBridge4.virtualDev = "pcieRootPort"
pciBridge4.functions = "8"
pciBridge5.present = "TRUE"
pciBridge5.virtualDev = "pcieRootPort"
pciBridge5.functions = "8"
pciBridge6.present = "TRUE"
pciBridge6.virtualDev = "pcieRootPort"
pciBridge6.functions = "8"
pciBridge7.present = "TRUE"
pciBridge7.virtualDev = "pcieRootPort"
pciBridge7.functions = "8"
vmci0.present = "TRUE"
hpet0.present = "TRUE"
floppy0.present = "FALSE"
memSize = "256"
scsi0.virtualDev = "lsilogic"
scsi0.present = "TRUE"
ide0:0.startConnected = "FALSE"
ide0:0.deviceType = "atapi-cdrom"
ide0:0.fileName = "CD/DVD drive 0"
ide0:0.present = "TRUE"
scsi0:0.deviceType = "scsi-hardDisk"
scsi0:0.fileName = "533b6fcf3fa6301aadcc2b168f3f999a-000001.vmdk"
scsi0:0.present = "TRUE"
displayName = "533b6fcf3fa6301aadcc2b168f3f999a"
annotation = "systemvmtemplate-4.11.2.0-vmware"
guestOS = "otherlinux-64"
toolScripts.afterPowerOn = "TRUE"
toolScripts.afterResume = "TRUE"
toolScripts.beforeSuspend = "TRUE"
toolScripts.beforePowerOff = "TRUE"
uuid.bios = "42 02 f1 40 33 e8 de e5-1a c5 93 2a c9 12 47 61"
vc.uuid = "50 02 5b d9 e9 c9 77 86-28 3e 84 00 22 2b eb d3"
firmware = "bios"
migrate.hostLog = "533b6fcf3fa6301aadcc2b168f3f999a-7d5d73de.hlog"


6) *.vmsd file content:

cat /vmfs/volumes/42054b8459633172be231d72a52d59d4/533b6fcf3fa6301aadcc2b168f3f999a/533b6fcf3fa6301aadcc2b168f3f999a.vmsd
.encoding = "UTF-8"
snapshot.lastUID = "1"
snapshot.current = "1"
snapshot0.uid = "1"
snapshot0.filename = "533b6fcf3fa6301aadcc2b168f3f999a-Snapshot1.vmsn"
snapshot0.displayName = "cloud.template.base"
snapshot0.description = "Base snapshot"
snapshot0.createTimeHigh = "363123"
snapshot0.createTimeLow = "-679076964"
snapshot0.numDisks = "1"
snapshot0.disk0.fileName = "533b6fcf3fa6301aadcc2b168f3f999a.vmdk"
snapshot0.disk0.node = "scsi0:0"
snapshot.numSnapshots = "1"

7) *-Snapshot1.vmsn content:

cat /vmfs/volumes/42054b8459633172be231d72a52d59d4/533b6fcf3fa6301aadcc2b168f3f999a/533b6fcf3fa6301aadcc2b168f3f999a-Snapshot1.vmsn 
ҾSnapshot\?%?cfgFilet%t%.encoding = "UTF-8"
config.version = "8"
virtualHW.version = "8"
nvram = "533b6fcf3fa6301aadcc2b168f3f999a.nvram"
pciBridge0.present = "TRUE"
svga.present = "TRUE"
pciBridge4.present = "TRUE"
pciBridge4.virtualDev = "pcieRootPort"
pciBridge4.functions = "8"
pciBridge5.present = "TRUE"
pciBridge5.virtualDev = "pcieRootPort"
pciBridge5.functions = "8"
pciBridge6.present = "TRUE"
pciBridge6.virtualDev = "pcieRootPort"
pciBridge6.functions = "8"
pciBridge7.present = "TRUE"
pciBridge7.virtualDev = "pcieRootPort"
pciBridge7.functions = "8"
vmci0.present = "TRUE"
hpet0.present = "TRUE"
floppy0.present = "FALSE"
memSize = "256"
scsi0.virtualDev = "lsilogic"
scsi0.present = "TRUE"
ide0:0.startConnected = "FALSE"
ide0:0.deviceType = "atapi-cdrom"
ide0:0.fileName = "CD/DVD drive 0"
ide0:0.present = "TRUE"
scsi0:0.deviceType = "scsi-hardDisk"
scsi0:0.fileName = "533b6fcf3fa6301aadcc2b168f3f999a.vmdk"
scsi0:0.present = "TRUE"
displayName = "533b6fcf3fa6301aadcc2b168f3f999a"
annotation = "systemvmtemplate-4.11.2.0-vmware"
guestOS = "otherlinux-64"
toolScripts.afterPowerOn = "TRUE"
toolScripts.afterResume = "TRUE"
toolScripts.beforeSuspend = "TRUE"
toolScripts.beforePowerOff = "TRUE"
uuid.bios = "42 02 f1 40 33 e8 de e5-1a c5 93 2a c9 12 47 61"
vc.uuid = "50 02 5b d9 e9 c9 77 86-28 3e 84 00 22 2b eb d3"
firmware = "bios"
migrate.hostLog = "533b6fcf3fa6301aadcc2b168f3f999a-7d5d73de.hlog"


------------

That's all the data on the template VMDK.

I much appreciate your time!

Yiping



On 6/4/19, 9:29 AM, "Sergey Levitskiy" <se...@hotmail.com> wrote:

    Have you tried deleting the template from PS and letting ACS recopy it? If the issue is reproducible, we can try to work out what is wrong with the VMDK. Please post the content of 533b6fcf3fa6301aadcc2b168f3f999a.vmdk, 533b6fcf3fa6301aadcc2b168f3f999a-000001.vmdk and 533b6fcf3fa6301aadcc2b168f3f999a.vmx (or their equivalent after ACS finishes copying the template). Also, from one of your ESX hosts, the output of:
    ls -al /vmfs/volumes
    ls -al /vmfs/volumes/*/533b6fcf3fa6301aadcc2b168f3f999a (or their equivalent after ACS finishes copying the template)
    
    Can you also post the management server log starting from the point where you unregister and delete the template from vCenter?
    
    On 6/4/19, 8:37 AM, "Yiping Zhang" <yi...@adobe.com.INVALID> wrote:
    
        I have manually imported the OVA to vCenter and successfully cloned a VM instance with it, on the same NFS datastore.
        
        
        On 6/4/19, 8:25 AM, "Sergey Levitskiy" <se...@hotmail.com> wrote:
        
            I would suspect the template is corrupted on the secondary storage. You can try disabling/enabling the linked clone feature and see if it works the other way.
            vmware.create.full.clone    false
            
            Also, the systemVM template might have been generated on a newer version of vSphere and not be compatible with ESXi 6.5. To validate this, you can manually deploy the OVA that is in secondary storage and try to spin up a VM from it directly in vCenter.
            
            
            
            On 6/3/19, 5:41 PM, "Yiping Zhang" <yi...@adobe.com.INVALID> wrote:
            
                Hi, list:
                
                I am struggling with deploying a new advanced zone using ACS 4.11.2.0 + vSphere 6.5 + NetApp volumes for primary and secondary storage devices. The initial setup of CS management server, seeding of systemVM template, and advanced zone deployment all went smoothly.
                
                Once I enabled the zone in web UI and the systemVM template gets copied/staged on to primary storage device. But subsequent VM creations from this template would fail with errors:
                
                
                2019-06-03 18:38:15,764 INFO  [c.c.h.v.m.HostMO] (DirectAgent-7:ctx-d01169cb esx-0001-a-001.example.org, job-3/job-29, cmd: CopyCommand) VM 533b6fcf3fa6301aadcc2b168f3f999a not found in host cache
                
                2019-06-03 18:38:17,017 INFO  [c.c.h.v.r.VmwareResource] (DirectAgent-4:ctx-08b54fbd esx-0001-a-001.example.org, job-3/job-29, cmd: CopyCommand) VmwareStorageProcessor and VmwareStorageSubsystemCommandHandler successfully reconfigured
                
                2019-06-03 18:38:17,128 INFO  [c.c.s.r.VmwareStorageProcessor] (DirectAgent-4:ctx-08b54fbd esx-0001-a-001.example.org, job-3/job-29, cmd: CopyCommand) creating full clone from template
                
                2019-06-03 18:38:17,657 INFO  [c.c.h.v.u.VmwareHelper] (DirectAgent-4:ctx-08b54fbd esx-0001-a-001.example.org, job-3/job-29, cmd: CopyCommand) [ignored]failed toi get message for exception: Error caused by file /vmfs/volumes/afc5e946-03bfe3c2/533b6fcf3fa6301aadcc2b168f3f999a/533b6fcf3fa6301aadcc2b168f3f999a-000001.vmdk
                
                2019-06-03 18:38:17,658 ERROR [c.c.s.r.VmwareStorageProcessor] (DirectAgent-4:ctx-08b54fbd esx-0001-a-001.example.org, job-3/job-29, cmd: CopyCommand) clone volume from base image failed due to Exception: java.lang.RuntimeException
                
                Message: Error caused by file /vmfs/volumes/afc5e946-03bfe3c2/533b6fcf3fa6301aadcc2b168f3f999a/533b6fcf3fa6301aadcc2b168f3f999a-000001.vmdk
                
                
                
                If I try to create “new VM from template” (533b6fcf3fa6301aadcc2b168f3f999a) on vCenter UI manually,  I will receive exactly the same error message. The name of the VMDK file in the error message is a snapshot of the base disk image, but it is not part of the original template OVA on the secondary storage.  So, in the process of copying the template from secondary to primary storage, a snapshot got created and the disk became corrupted/unusable.
                
                Much later in the log file,  there is another error message “failed to fetch any free public IP address” (for ssvm, I think).  I don’t know if these two errors are related or if one is the root cause for the other error.
                
                The full management server log is uploaded as https://pastebin.com/c05wiQ3R
                
                Any help or insight on what went wrong here are much appreciated.
                
                Thanks
                
                Yiping
                
            
            
        
        
    
    


Re: Can't start systemVM in a new advanced zone deployment

Posted by Sergey Levitskiy <se...@hotmail.com>.
Have you tried deleting the template from PS and letting ACS recopy it? If the issue is reproducible, we can try to work out what is wrong with the VMDK. Please post the content of 533b6fcf3fa6301aadcc2b168f3f999a.vmdk, 533b6fcf3fa6301aadcc2b168f3f999a-000001.vmdk and 533b6fcf3fa6301aadcc2b168f3f999a.vmx (or their equivalent after ACS finishes copying the template). Also, from one of your ESX hosts, the output of:
ls -al /vmfs/volumes
ls -al /vmfs/volumes/*/533b6fcf3fa6301aadcc2b168f3f999a (or their equivalent after ACS finishes copying the template)

Can you also post the management server log starting from the point where you unregister and delete the template from vCenter?

On 6/4/19, 8:37 AM, "Yiping Zhang" <yi...@adobe.com.INVALID> wrote:

    I have manually imported the OVA to vCenter and successfully cloned a VM instance with it, on the same NFS datastore.
    
    
    On 6/4/19, 8:25 AM, "Sergey Levitskiy" <se...@hotmail.com> wrote:
    
        I would suspect the template is corrupted on the secondary storage. You can try disabling/enabling the linked clone feature and see if it works the other way.
        vmware.create.full.clone    false
        
        Also, the systemVM template might have been generated on a newer version of vSphere and not be compatible with ESXi 6.5. To validate this, you can manually deploy the OVA that is in secondary storage and try to spin up a VM from it directly in vCenter.
        
        
        
        On 6/3/19, 5:41 PM, "Yiping Zhang" <yi...@adobe.com.INVALID> wrote:
        
            Hi, list:
            
            I am struggling with deploying a new advanced zone using ACS 4.11.2.0 + vSphere 6.5 + NetApp volumes for primary and secondary storage devices. The initial setup of CS management server, seeding of systemVM template, and advanced zone deployment all went smoothly.
            
            Once I enabled the zone in web UI and the systemVM template gets copied/staged on to primary storage device. But subsequent VM creations from this template would fail with errors:
            
            
            2019-06-03 18:38:15,764 INFO  [c.c.h.v.m.HostMO] (DirectAgent-7:ctx-d01169cb esx-0001-a-001.example.org, job-3/job-29, cmd: CopyCommand) VM 533b6fcf3fa6301aadcc2b168f3f999a not found in host cache
            
            2019-06-03 18:38:17,017 INFO  [c.c.h.v.r.VmwareResource] (DirectAgent-4:ctx-08b54fbd esx-0001-a-001.example.org, job-3/job-29, cmd: CopyCommand) VmwareStorageProcessor and VmwareStorageSubsystemCommandHandler successfully reconfigured
            
            2019-06-03 18:38:17,128 INFO  [c.c.s.r.VmwareStorageProcessor] (DirectAgent-4:ctx-08b54fbd esx-0001-a-001.example.org, job-3/job-29, cmd: CopyCommand) creating full clone from template
            
            2019-06-03 18:38:17,657 INFO  [c.c.h.v.u.VmwareHelper] (DirectAgent-4:ctx-08b54fbd esx-0001-a-001.example.org, job-3/job-29, cmd: CopyCommand) [ignored]failed toi get message for exception: Error caused by file /vmfs/volumes/afc5e946-03bfe3c2/533b6fcf3fa6301aadcc2b168f3f999a/533b6fcf3fa6301aadcc2b168f3f999a-000001.vmdk
            
            2019-06-03 18:38:17,658 ERROR [c.c.s.r.VmwareStorageProcessor] (DirectAgent-4:ctx-08b54fbd esx-0001-a-001.example.org, job-3/job-29, cmd: CopyCommand) clone volume from base image failed due to Exception: java.lang.RuntimeException
            
            Message: Error caused by file /vmfs/volumes/afc5e946-03bfe3c2/533b6fcf3fa6301aadcc2b168f3f999a/533b6fcf3fa6301aadcc2b168f3f999a-000001.vmdk
            
            
            
            If I try to create “new VM from template” (533b6fcf3fa6301aadcc2b168f3f999a) on vCenter UI manually,  I will receive exactly the same error message. The name of the VMDK file in the error message is a snapshot of the base disk image, but it is not part of the original template OVA on the secondary storage.  So, in the process of copying the template from secondary to primary storage, a snapshot got created and the disk became corrupted/unusable.
            
            Much later in the log file,  there is another error message “failed to fetch any free public IP address” (for ssvm, I think).  I don’t know if these two errors are related or if one is the root cause for the other error.
            
            The full management server log is uploaded as https://pastebin.com/c05wiQ3R
            
            Any help or insight on what went wrong here are much appreciated.
            
            Thanks
            
            Yiping
            
        
        
    
    


Re: Can't start systemVM in a new advanced zone deployment

Posted by Yiping Zhang <yi...@adobe.com.INVALID>.
I have manually imported the OVA to vCenter and successfully cloned a VM instance with it, on the same NFS datastore.


On 6/4/19, 8:25 AM, "Sergey Levitskiy" <se...@hotmail.com> wrote:

    I would suspect the template is corrupted on the secondary storage. You can try disabling/enabling the linked clone feature and see if it works the other way.
    vmware.create.full.clone    false
    
    Also, the systemVM template might have been generated on a newer version of vSphere and not be compatible with ESXi 6.5. To validate this, you can manually deploy the OVA that is in secondary storage and try to spin up a VM from it directly in vCenter.
    
    
    
    On 6/3/19, 5:41 PM, "Yiping Zhang" <yi...@adobe.com.INVALID> wrote:
    
        Hi, list:
        
        I am struggling with deploying a new advanced zone using ACS 4.11.2.0 + vSphere 6.5 + NetApp volumes for primary and secondary storage devices. The initial setup of CS management server, seeding of systemVM template, and advanced zone deployment all went smoothly.
        
        Once I enabled the zone in web UI and the systemVM template gets copied/staged on to primary storage device. But subsequent VM creations from this template would fail with errors:
        
        
        2019-06-03 18:38:15,764 INFO  [c.c.h.v.m.HostMO] (DirectAgent-7:ctx-d01169cb esx-0001-a-001.example.org, job-3/job-29, cmd: CopyCommand) VM 533b6fcf3fa6301aadcc2b168f3f999a not found in host cache
        
        2019-06-03 18:38:17,017 INFO  [c.c.h.v.r.VmwareResource] (DirectAgent-4:ctx-08b54fbd esx-0001-a-001.example.org, job-3/job-29, cmd: CopyCommand) VmwareStorageProcessor and VmwareStorageSubsystemCommandHandler successfully reconfigured
        
        2019-06-03 18:38:17,128 INFO  [c.c.s.r.VmwareStorageProcessor] (DirectAgent-4:ctx-08b54fbd esx-0001-a-001.example.org, job-3/job-29, cmd: CopyCommand) creating full clone from template
        
        2019-06-03 18:38:17,657 INFO  [c.c.h.v.u.VmwareHelper] (DirectAgent-4:ctx-08b54fbd esx-0001-a-001.example.org, job-3/job-29, cmd: CopyCommand) [ignored]failed toi get message for exception: Error caused by file /vmfs/volumes/afc5e946-03bfe3c2/533b6fcf3fa6301aadcc2b168f3f999a/533b6fcf3fa6301aadcc2b168f3f999a-000001.vmdk
        
        2019-06-03 18:38:17,658 ERROR [c.c.s.r.VmwareStorageProcessor] (DirectAgent-4:ctx-08b54fbd esx-0001-a-001.example.org, job-3/job-29, cmd: CopyCommand) clone volume from base image failed due to Exception: java.lang.RuntimeException
        
        Message: Error caused by file /vmfs/volumes/afc5e946-03bfe3c2/533b6fcf3fa6301aadcc2b168f3f999a/533b6fcf3fa6301aadcc2b168f3f999a-000001.vmdk
        
        
        
        If I try to create “new VM from template” (533b6fcf3fa6301aadcc2b168f3f999a) on vCenter UI manually,  I will receive exactly the same error message. The name of the VMDK file in the error message is a snapshot of the base disk image, but it is not part of the original template OVA on the secondary storage.  So, in the process of copying the template from secondary to primary storage, a snapshot got created and the disk became corrupted/unusable.
        
        Much later in the log file,  there is another error message “failed to fetch any free public IP address” (for ssvm, I think).  I don’t know if these two errors are related or if one is the root cause for the other error.
        
        The full management server log is uploaded as https://pastebin.com/c05wiQ3R
        
        Any help or insight on what went wrong here are much appreciated.
        
        Thanks
        
        Yiping
        
    
    


Re: Can't start systemVM in a new advanced zone deployment

Posted by Sergey Levitskiy <se...@hotmail.com>.
I would suspect the template is corrupted on the secondary storage. You can try disabling/enabling the linked clone feature and see if it works the other way.
vmware.create.full.clone 			false
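
For what it's worth, that setting can be flipped either in the UI under Global Settings or via CloudMonkey (a sketch only; depending on the release it may also be overridable per zone or per primary storage, and a management server restart may be needed for it to take effect):

update configuration name=vmware.create.full.clone value=false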

Also, the systemVM template might have been generated on a newer version of vSphere and not be compatible with ESXi 6.5. To validate this, you can manually deploy the OVA that is in secondary storage and try to spin up a VM from it directly in vCenter.
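
If doing that test from the command line is more convenient than the vCenter UI, something along these lines should work (a sketch only; it assumes govc is installed and configured, the exact flags may differ by govc version, and the datastore and path below are placeholders):

# import the systemvm OVA onto the primary-storage datastore, then try cloning the resulting VM in vCenter
govc import.ova -ds=<primary-datastore-name> -name=systemvm-import-test /path/to/systemvmtemplate-4.11.2.0-vmware.ova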



On 6/3/19, 5:41 PM, "Yiping Zhang" <yi...@adobe.com.INVALID> wrote:

    Hi, list:
    
    I am struggling with deploying a new advanced zone using ACS 4.11.2.0 + vSphere 6.5 + NetApp volumes for primary and secondary storage devices. The initial setup of CS management server, seeding of systemVM template, and advanced zone deployment all went smoothly.
    
    Once I enabled the zone in web UI and the systemVM template gets copied/staged on to primary storage device. But subsequent VM creations from this template would fail with errors:
    
    
    2019-06-03 18:38:15,764 INFO  [c.c.h.v.m.HostMO] (DirectAgent-7:ctx-d01169cb esx-0001-a-001.example.org, job-3/job-29, cmd: CopyCommand) VM 533b6fcf3fa6301aadcc2b168f3f999a not found in host cache
    
    2019-06-03 18:38:17,017 INFO  [c.c.h.v.r.VmwareResource] (DirectAgent-4:ctx-08b54fbd esx-0001-a-001.example.org, job-3/job-29, cmd: CopyCommand) VmwareStorageProcessor and VmwareStorageSubsystemCommandHandler successfully reconfigured
    
    2019-06-03 18:38:17,128 INFO  [c.c.s.r.VmwareStorageProcessor] (DirectAgent-4:ctx-08b54fbd esx-0001-a-001.example.org, job-3/job-29, cmd: CopyCommand) creating full clone from template
    
    2019-06-03 18:38:17,657 INFO  [c.c.h.v.u.VmwareHelper] (DirectAgent-4:ctx-08b54fbd esx-0001-a-001.example.org, job-3/job-29, cmd: CopyCommand) [ignored]failed toi get message for exception: Error caused by file /vmfs/volumes/afc5e946-03bfe3c2/533b6fcf3fa6301aadcc2b168f3f999a/533b6fcf3fa6301aadcc2b168f3f999a-000001.vmdk
    
    2019-06-03 18:38:17,658 ERROR [c.c.s.r.VmwareStorageProcessor] (DirectAgent-4:ctx-08b54fbd esx-0001-a-001.example.org, job-3/job-29, cmd: CopyCommand) clone volume from base image failed due to Exception: java.lang.RuntimeException
    
    Message: Error caused by file /vmfs/volumes/afc5e946-03bfe3c2/533b6fcf3fa6301aadcc2b168f3f999a/533b6fcf3fa6301aadcc2b168f3f999a-000001.vmdk
    
    
    
    If I try to create “new VM from template” (533b6fcf3fa6301aadcc2b168f3f999a) on vCenter UI manually,  I will receive exactly the same error message. The name of the VMDK file in the error message is a snapshot of the base disk image, but it is not part of the original template OVA on the secondary storage.  So, in the process of copying the template from secondary to primary storage, a snapshot got created and the disk became corrupted/unusable.
    
    Much later in the log file,  there is another error message “failed to fetch any free public IP address” (for ssvm, I think).  I don’t know if these two errors are related or if one is the root cause for the other error.
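    
    (One hedged suggestion on that second error: "failed to fetch any free public IP address" generally means no free address could be allocated from the zone's public range, which can be checked independently of the template problem, for example with CloudMonkey; the zone id below is a placeholder:
    
    list vlanipranges zoneid=<zone-uuid>
    list publicipaddresses listall=true
    
    If the configured ranges look correct and still show free addresses, the IP error is probably a separate issue from the VMDK one.)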
    
    The full management server log is uploaded as https://pastebin.com/c05wiQ3R
    
    Any help or insight on what went wrong here are much appreciated.
    
    Thanks
    
    Yiping