Create image of VyOS 1.2.0 for Amazon Web Services
Open, Normal, Public

Description

Provide a public AMI of VyOS 1.2.0

Conversation started in T100

Details

Difficulty level: Hard (possibly days)
Version: -
Why the issue appeared? Will be filled on close

Question from @silverbp. In T100#4449, @silverbp wrote:

with ubuntu 14.04 I still get the error above.
mount: wrong fs type, bad option, bad superblock on overlayfs,

Running through those packer steps. I don't understand disk partitioning enough on this to be able to work through that.

@silverbp are you referring to branch issue-1-support-vyos-1.2?
I haven't merged this to master because it doesn't generate a working 1.2.0 yet and would break the existing working 1.1.7 configuration.

In the branch, I added the missing fstype to all the mount commands and this part of the process passes.

The problem I hit now with Ubuntu 14.04 is that /bin/cli-shell-api setupSession dumps core when executed inside chroot.
Trying to use Debian Jessie fails because it doesn't have 3.18 kernel which is required for OverlayFS support.
Upgrading Debian Jessie AMI with kernel 4.9 from backports results in a non-booting AMI (well it boots but doesn't start ssh daemon or output the boot logs).

I'll try Ubuntu 16.04 next time I get around to it.

I managed to get past the OverlayFS issue by adding:

mkdir /mnt/wroot/workdir

And adding workdir=/mnt/wroot/workdir to the OverlayFS mount command.
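
For reference, the workaround above might be sketched as the shell functions below. This is only a sketch under assumptions: the thread gives just the workdir path, so the lower directory, upper directory, and mount point are hypothetical arguments, and the filesystem type is assumed to be `overlay` (the name used by mainline kernels from 3.18 on; some older Ubuntu kernels used `overlayfs`). OverlayFS expects workdir to be an empty directory on the same filesystem as upperdir.

```shell
#!/bin/sh
# Build the OverlayFS option string. All three paths are caller-supplied;
# only /mnt/wroot/workdir comes from the thread, the rest are placeholders.
overlay_opts() {
    printf 'lowerdir=%s,upperdir=%s,workdir=%s\n' "$1" "$2" "$3"
}

# Create the work directory and mount the overlay (needs root).
# $1=lowerdir  $2=upperdir  $3=workdir  $4=mount point
mount_overlay() {
    mkdir -p "$3" || return 1
    mount -t overlay overlay -o "$(overlay_opts "$1" "$2" "$3")" "$4"
}

# Example with hypothetical lower/upper/target paths:
# mount_overlay /mnt/squashfs /mnt/wroot/upper /mnt/wroot/workdir /mnt/root
```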

But I still get the core dump. I suspect something to do with UnionFS not being supported, maybe?

# gdb /bin/cli-shell-api
GNU gdb (Debian 7.7.1+dfsg-5) 7.7.1
Copyright (C) 2014 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /bin/cli-shell-api...(no debugging symbols found)...done.
(gdb) run setupSession
Starting program: /bin/cli-shell-api setupSession
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
terminate called after throwing an instance of 'std::out_of_range'
  what():  basic_string::erase: __pos (which is 18446744073709551615) > this->size() (which is 0)

Program received signal SIGABRT, Aborted.
0x00007ffff69cf067 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
56	../nptl/sysdeps/unix/sysv/linux/raise.c: No such file or directory.
(gdb) where
#0  0x00007ffff69cf067 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
#1  0x00007ffff69d0448 in __GI_abort () at abort.c:89
#2  0x00007ffff72bcb3d in __gnu_cxx::__verbose_terminate_handler() () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#3  0x00007ffff72babb6 in ?? () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#4  0x00007ffff72bac01 in std::terminate() () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#5  0x00007ffff72bae19 in __cxa_throw () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#6  0x00007ffff7310cdf in std::__throw_out_of_range_fmt(char const*, ...) () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#7  0x00007ffff731b2d1 in std::string::erase(unsigned long, unsigned long) () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#8  0x00007ffff7b73c98 in cstore::unionfs::UnionfsCstore::setupSession() () from /usr/lib/libvyatta-cfg.so.1
#9  0x000000000040968d in ?? ()
#10 0x000000000040b3fc in ?? ()
#11 0x00007ffff69bbb45 in __libc_start_main (main=0x40b04a, argc=2, argv=0x7fffffffe1e8, init=<optimized out>, fini=<optimized out>,
    rtld_fini=<optimized out>, stack_end=0x7fffffffe1d8) at libc-start.c:287
#12 0x0000000000408e59 in ?? ()
(gdb) quit

The above was taken from running inside chroot under Ubuntu 16.04. I'll try to burn and boot from that disk image later.

syncer claimed this task. Feb 7 2017, 3:17 PM

Trying to use Debian Jessie fails because it doesn't have 3.18 kernel which is required for OverlayFS support.

I have no idea what you are doing, but: 1) overlayfs is used for the livecd union mount and NOT for config sessions 2) for config sessions, a userspace implementation of unionfs (unionfs-fuse) is used.

syncer changed Difficulty level from Easy (less than an hour) to Hard (possibly days).
syncer reassigned this task from syncer to amos.shapira.
syncer added subscribers: VyOS 1.2.x, AWS Support.
syncer added a subscriber: syncer.
amos.shapira added a comment. Edited Feb 7 2017, 7:24 PM

I'm building the VyOS disk image under another OS (the latest test used Ubuntu 16.04 LTS) and was testing the extra configuration script under chroot. If config sessions rely on unionfs-fuse, then the missing unionfs packages on the host OS could explain the error I get. I'll install them and try again.
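
A quick pre-check before running the configuration script inside the chroot might look like the sketch below. The binary name `unionfs-fuse` is taken from the comment above; `/dev/fuse` is the standard FUSE device node, which has to be visible inside the chroot as well. Whether a missing FUSE setup is the actual cause of the crash is an untested assumption, and the function name is made up for illustration.

```shell
#!/bin/sh
# Report whether a userspace unionfs mount looks possible in this environment.
# $1: helper binary name (assumed default: unionfs-fuse)
# $2: FUSE device node (default: /dev/fuse)
fuse_unionfs_ready() {
    bin=${1:-unionfs-fuse}
    dev=${2:-/dev/fuse}
    command -v "$bin" >/dev/null 2>&1 || { echo "missing binary: $bin"; return 1; }
    [ -c "$dev" ] || { echo "missing FUSE device: $dev"; return 1; }
    echo "ok"
}

# Example: run inside the chroot before calling setupSession
# fuse_unionfs_ready || echo "install unionfs-fuse and expose /dev/fuse first"
```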

@amos.shapira @silverbp
To save you some time here in T100:
You must use Debian Jessie to build 1.2.x.

I'm not building VyOS from source; I extract it from the nightly .iso file.
This works great for me with VyOS 1.1.7.
Debian Jessie doesn't have the kernel 3.18 which is required for OverlayFS.

@amos.shapira I was just working through the packer script from the original script you gave me. I'll wait to see what you work out and see if I can apply something similar to GCE

Just posting here in case someone can help me out -

I managed to create a booting VyOS 1.2.0 AMI, but it fails to fetch the ssh key and configure the ssh daemon because (as far as I can tell) "setupSession" aborts on a failed assert(3).

The next two steps I intend to try:

  1. Configure the AMI to run sshd outside the VyOS configuration just to let me access the setupSession core file
  2. Get debug symbols for setupSession to understand what the variable it aborts on is supposed to hold.

Any tips to help do this would be welcome.
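
Once the core file is reachable, a backtrace can be pulled from it without an interactive gdb session. The sketch below assumes the core file path is known (it depends on the system's core_pattern, so the path in the example is hypothetical) and that the crashing binary is /bin/cli-shell-api, as in the earlier gdb run. Without debug symbols the frames will still show as `??`.

```shell
#!/bin/sh
# Print a backtrace from a core file using gdb in batch mode.
# $1: path to the core file (hypothetical; depends on core_pattern)
# $2: binary that produced it (defaults to /bin/cli-shell-api)
core_backtrace() {
    core=$1
    bin=${2:-/bin/cli-shell-api}
    gdb -batch -ex 'bt' "$bin" "$core"
}

# Example:
# core_backtrace /tmp/core
```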

Hi Amos,

Did you find a way to make it work, or were you able to upgrade from a previous instance? We implemented a transit VPC, but we are having trouble with out-of-order packets in our BGP/OSPF multipath architecture, and we need to try the fixed ECMP implementation in the 4.4 kernel.

Best regards,

Julian

Hi Julian,

Sorry, but I didn't get around to looking at this further.

Thanks Amos.

We were able to do it manually by loading the 1.2 developer image onto the AWS public image, but creating new images from the updated one doesn't work.

The good news is that ECMP hash-based routing works well over OSPF, but things like OpenVPN, VRRP, and maybe others do not work properly.

Julian

higebu added a subscriber: higebu. Jul 26 2017, 4:20 PM

I created an initial script for making an AMI by importing a VMDK. I think this approach is more useful for multi-region support.
The commit is here: https://github.com/vyos/vyos-build/commit/3dc5653bbb8133f976e4603c8f78a5f6cf686405
What do you think?

amos.shapira added a comment. Edited Jul 26 2017, 8:50 PM

That's one way to get it done.
You can also support multiple regions by copying the uploaded AMI with aws ec2 copy-image, if you like.
I left comments suggesting an improvement: use the AWS CLI's own --query flag to extract the output instead of filtering through jq (which removes the need for yet another tool).
I gave instructions in the comments as far as I could without testing the specific command.
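
As a rough illustration of both suggestions, the sketch below copies an AMI to another region and uses --query with --output text to print only the new image ID, which is what piping the JSON through `jq -r .ImageId` would otherwise do. The function name, regions, AMI ID, and image name are placeholders, and the command has not been run against the script in the linked commit.

```shell
#!/bin/sh
# Copy an AMI to another region and print only the new image ID.
# $1: source region  $2: source AMI ID  $3: destination region  $4: image name
copy_ami() {
    src_region=$1
    src_ami=$2
    dst_region=$3
    name=$4
    aws ec2 copy-image \
        --source-region "$src_region" \
        --source-image-id "$src_ami" \
        --region "$dst_region" \
        --name "$name" \
        --query 'ImageId' \
        --output text
}

# Example (needs AWS credentials; all values are placeholders):
# new_ami=$(copy_ami us-east-1 ami-0123456789abcdef0 eu-west-1 vyos-1.2.0)
```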

syncer triaged this task as Normal priority. Aug 1 2017, 3:54 AM
syncer changed the edit policy from "Task Author" to "Custom Policy".
syncer set Version to -.
syncer edited subscribers, added: Maintainers, Core Community; removed: higebu, syncer, AWS Support and 3 others.