
Microsoft touts SQL Server 2017 as 'first RDBMS with built-in AI'


The 2017 Microsoft Product Roadmap

Many key Microsoft products reached significant milestones in 2016, with next-gen versions of SharePoint Server, SQL Server and Windows Server all being rolled out alongside major updates to the Dynamics portfolio and, of course, Windows. This year's product roadmap looks to be a bit less crowded, though major changes are on tap for Microsoft's productivity solutions, while Windows 10 is poised for another landmark update. Here's what to watch for in the coming months.


With a constantly changing and increasingly diverse IT landscape, particularly in terms of heterogeneous operating systems (Linux, Windows, etc.), IT organizations must contend with multiple data types, different development languages, and a mix of on-premises, cloud, and hybrid environments, all while reducing operational costs. To enable you to choose the best platform for your data and applications, SQL Server is bringing its world-class RDBMS to both Linux and Windows with SQL Server v.Next.



You will learn more about the SQL Server on Linux offering and how it provides a broader range of choice for all organizations, not just those who want to run SQL on Windows. It enables SQL Server to run in more private, public, and hybrid cloud ecosystems, to be used by developers regardless of programming languages, frameworks or tools, and further empowers ‘every person and every organization on the planet to achieve more.’

Bootcamp 2017 - SQL Server on Linux


Learn More about:

  • What’s next for SQL Server on Linux
  • The Evolution and Power of SQL Server 2016
  • Enabling DevOps practices such as Dev/Test and CI/CD with containers
  • What is new with SQL Server 2016 SP1: Enterprise-class features in every edition
  • How to determine which SQL Server edition to deploy based on operational need, not feature set

SQL Server on Linux: High Availability and security on Linux


Why Microsoft for your operational database management system?

When it comes to the systems you choose for managing your data, you want performance and security that won't get in the way of running your business. As an industry leader in operational database management systems (ODBMS), Microsoft continuously improves its offerings to help you get the most out of your ever-expanding data world.

Read Gartner's assessment of the ODBMS landscape and learn about the Microsoft "cloud first" strategy. In its latest Magic Quadrant report for ODBMS, Gartner positioned the Microsoft DBMS furthest in completeness of vision and highest for ability to execute; a Gartner reprint covering SQL Server 2017 is available.

Top Features Coming to SQL Server 2017
From Python to adaptive query optimization to the many cloud-focused changes (not to mention Linux!), Joey D'Antoni takes you through the major changes coming to SQL Server 2017.

Top three capabilities to get excited about in the next version of SQL Server

Microsoft announced the first public preview of SQL Server v.Next in November 2016, and since then we’ve had lots of customer interest, but a few key scenarios are generating the most discussion.

If you’d like to learn more about SQL Server v.Next on Linux and Windows, please join us for the upcoming Microsoft Data Amp online event on April 19 at 8 AM Pacific. It will showcase how data is the nexus between application innovation and intelligence—how data and analytics powered by the most trusted and intelligent cloud can help companies differentiate and out-innovate their competition.

In this blog, we discuss three top things that customers are excited to do with the next version of SQL Server.

Scenario 1: Give applications the power of SQL Server on the platform of your choice

With the upcoming availability of SQL Server v.Next on Linux, Windows, and Docker, customers will have the added flexibility to build and deploy more of their applications on SQL Server. In addition to Windows Server and Windows 10, SQL Server v.Next supports Red Hat Enterprise Linux (RHEL), Ubuntu, and SUSE Linux Enterprise Server (SLES). SQL Server v.Next also runs in Linux and Windows Docker containers, opening up even more possibilities to run on public and private cloud application platforms like Kubernetes, OpenShift, Docker Swarm, Mesosphere DC/OS, Azure Stack, and OpenStack. Customers will be able to continue to leverage existing tools, talents, and resources for more of their applications.


Some of the things customers are planning for SQL Server v.Next on Windows, Linux, and Docker include migrating existing applications from other databases on Linux to SQL Server; implementing new DevOps processes using Docker containers; developing locally on the dev machine of choice, including Windows, Linux, and macOS; and building new applications on SQL Server that can run anywhere—on Windows, Linux, or Docker containers, on-premises, and in the cloud.

SQL Server on Linux - March 2017


Scenario 2: Faster performance with minimal effort

SQL Server v.Next further expands the use cases supported by SQL Server’s in-memory capabilities, In-Memory OLTP and In-Memory ColumnStore. These capabilities can be combined on a single table delivering the best Hybrid Transactional and Analytical Processing (HTAP) performance available in any database system. Both in-memory capabilities can yield performance improvements of more than 30x, enabling the possibility to perform analytics in real time on operational data.

In v.Next, natively compiled stored procedures (In-Memory OLTP) now support JSON data as well as new query capabilities. For columnstore, both building and rebuilding a nonclustered columnstore index can now be done online. Another critical addition to columnstore is support for LOBs (Large Objects).
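As a minimal sketch of that HTAP combination (assuming pyodbc on the client, a SQL Server 2016 SP1 or later instance, and a database that already has a memory-optimized filegroup; the connection string, table, and column names are illustrative):

# Illustrative only: a memory-optimized table that also carries a clustered
# columnstore index, combining in-memory OLTP with in-memory analytics.
import pyodbc

conn = pyodbc.connect(
    "DRIVER={ODBC Driver 17 for SQL Server};SERVER=localhost;"
    "DATABASE=DemoDB;UID=sa;PWD=example-Password1", autocommit=True)

conn.execute("""
CREATE TABLE dbo.Trades (
    TradeId  INT IDENTITY NOT NULL PRIMARY KEY NONCLUSTERED,
    Symbol   NVARCHAR(10)  NOT NULL,
    Quantity INT           NOT NULL,
    Price    DECIMAL(18,4) NOT NULL,
    TradedAt DATETIME2     NOT NULL,
    INDEX cci_Trades CLUSTERED COLUMNSTORE        -- analytics over the same rows
) WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_AND_DATA)   -- in-memory OLTP
""")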

SQL Server on Linux 2017


With these additions, the parts of an application that can benefit from the extreme performance of SQL Server’s in-memory capabilities have been greatly expanded! We also introduced a new set of features that learn and adapt from an application’s query patterns over time without requiring actions from your DBA.


Scenario 3: Scale out your analytics

In preparation for the release of SQL Server v.Next, we are enabling the same High Availability (HA) and Disaster Recovery (DR) solutions on all platforms supported by SQL Server, including Windows and Linux. Always On Availability Groups is SQL Server’s flagship solution for HA and DR. Microsoft has released a preview of Always On Availability Groups for Linux in SQL Server v.Next Community Technology Preview (CTP) 1.3.

SQL Server Always On availability groups can have up to eight readable secondary replicas. Each of these secondary replicas can have its own replicas as well. When daisy-chained together, these readable replicas can create massive scale-out for analytics workloads. This scale-out scenario enables you to replicate around the globe, keeping read replicas close to your business analytics users. It is of particular interest to users with large data warehouse implementations, and it is also easy to set up.
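One practical way to push reporting work to those readable secondaries is to declare read-only intent when connecting. A minimal sketch, assuming an availability group listener with read-only routing already configured and pyodbc on the client (the listener, credentials, and query are illustrative):

# Illustrative only: ApplicationIntent=ReadOnly lets the listener route this
# connection to a readable secondary replica (read-only routing must be set
# up on the availability group).
import pyodbc

conn = pyodbc.connect(
    "DRIVER={ODBC Driver 17 for SQL Server};"
    "SERVER=agl-demo,1433;DATABASE=SalesDW;"
    "UID=report_user;PWD=example-Password1;"
    "ApplicationIntent=ReadOnly")

for row in conn.execute(
        "SELECT region, SUM(amount) AS total FROM dbo.Sales GROUP BY region"):
    print(row.region, row.total)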

In fact, you can now create availability groups that span Windows and Linux nodes, and scale out your analytics workloads across multiple operating systems.

In addition, a cross-platform availability group can be used to migrate a database from SQL Server on Windows to Linux, or vice versa, with minimal downtime. You can learn more about SQL Server HA and DR on Linux by reading the blog post SQL Server on Linux: Mission-critical HADR with Always On Availability Groups.

To find out more, you can watch our SQL Server on Linux webinars. Find instructions for acquiring and installing SQL Server v.Next on the operating system of your choice at http://www.microsoft.com/sqlserveronlinux. To get your SQL Server app on Linux faster, you can nominate your app for the SQL Server on Linux Early Adopter Program (EAP). Sign up now to see if your application qualifies for technical support, workload validation, and help moving your application to production on Linux before general availability.

To find out more about SQL Server v.Next and get all the latest announcements, register now to attend Microsoft Data Amp, where data gets to work.

Microsoft announced the name and many of the new features in the next release of SQL Server at its Data Amp Virtual Event on Wednesday. While SQL Server 2017 may not have as comprehensive a feature set as SQL Server 2016, there is still some big news and there are very interesting new features. The reason for this is simple -- the development cycle for SQL Server 2017 is much shorter than the SQL Server 2016 development cycle. The big news from Wednesday's event is that SQL Server 2017 will be released later this year on both Windows and Linux operating systems.

Microsoft Data Platform Airlift 2017 Rui Quintino Machine Learning with SQL Server 2016 and R Services


I was able to quickly download the latest Linux release on Docker and have it up and running on my Mac during today's briefing. (I have previously written about the Linux release here.) That speed to development is one of the major benefits of Docker that Microsoft hopes developers will leverage when building new applications. Docker is just one of many open source trends we have seen Microsoft adopt in recent years with SQL Server. Wednesday's soft launch not only introduced SQL on Linux, but also included Python support, a new graph engine, and a myriad of other features.

First R, Now Python
One of the major features of SQL Server 2016 was the integration of R, an open source statistical analysis language, into the SQL Server database engine. Users can call the sp_execute_external_script stored procedure to run R code that takes advantage of parallelism in the database engine. Savvy users might notice that the first parameter of this stored procedure is @language. Microsoft designed the procedure to be open-ended, and now adds Python as the second language that it supports. Python combines powerful scripting with eminent readability and is broadly used by IT admins, developers, data scientists, and data analysts. Additionally, Python can leverage external statistical packages to perform data manipulation and statistical analysis. When you combine this capability with Transact-SQL (T-SQL), the result is powerful.
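As a rough sketch of what that looks like from a client (assuming Machine Learning Services with Python is installed and 'external scripts enabled' has been switched on via sp_configure; the connection details, table, and column names are illustrative):

# Illustrative only: run Python inside the database engine through
# sp_execute_external_script. InputDataSet and OutputDataSet are the default
# names for the input query result and the returned data frame.
import pyodbc

conn = pyodbc.connect("DRIVER={ODBC Driver 17 for SQL Server};SERVER=localhost;"
                      "DATABASE=DemoDB;Trusted_Connection=yes")

tsql = """
EXEC sp_execute_external_script
     @language = N'Python',
     @script = N'
OutputDataSet = InputDataSet.groupby("region", as_index=False)["amount"].sum()
',
     @input_data_1 = N'SELECT region, amount FROM dbo.Sales'
WITH RESULT SETS ((region NVARCHAR(50), total_amount FLOAT));
"""

for row in conn.execute(tsql):
    print(row.region, row.total_amount)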

SQL Server 2017: Advanced Analytics with Python
In this session you will learn how SQL Server 2017 takes in-database analytics to the next level with support for both Python and R; delivering unparalleled scalability and speed with new deep learning algorithms built in. Download SQL Server 2017: https://aka.ms/sqlserver17linuxyt




Big Changes to the Cloud
It is rare for a Microsoft launch event to omit news about cloud services, and Wednesday's event was no exception. Microsoft Azure SQL Database (formerly known as SQL Azure), which is the company's Database as a Service offering, has always lacked complete compatibility with the on-premises (or in an Azure VM) version of SQL Server. Over time, compatibility has gotten much better, but there are still gaps such as unsupported features like SQL CLR and cross-database query.

SQL Server 2017: Security on Linux


The new solution to this problem is a hybrid Platform as a Service (PaaS)/Infrastructure as a Service (IaaS) offering that is currently called Azure Managed Instances. Just as with Azure SQL Database, the Managed Instances administrator is not responsible for OS and patching operations. However, the Managed Instances solution supports many features and functions that are not currently supported in SQL Database. One such feature is the cross-database query capability. In an on-premises environment, multiple databases commonly exist on the same instance, and a single query can reference separate databases by using database.schema.table notation. In SQL Database, it is not possible to reference multiple databases in one query, which has limited many migrations to the platform because of the amount of code that must be rewritten. Support for cross-database queries in Managed Instances simplifies the process of migrating applications to Azure PaaS offerings, and should thereby increase the number of independent software vendor (ISV) applications that can run in PaaS.
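For illustration, a minimal sketch of the database.schema.table notation described above, assuming two databases on the same (managed) instance and pyodbc on the client; the server, databases, and tables are all hypothetical:

# Illustrative only: a single query that joins tables living in two different
# databases on the same instance by using three-part names.
import pyodbc

conn = pyodbc.connect(
    "DRIVER={ODBC Driver 17 for SQL Server};"
    "SERVER=example-managed-instance.database.windows.net;"
    "DATABASE=Sales;UID=admin_user;PWD=example-Password1")

rows = conn.execute("""
SELECT o.OrderId, c.CustomerName
FROM   Sales.dbo.Orders      AS o
JOIN   Billing.dbo.Customers AS c ON c.CustomerId = o.CustomerId
""").fetchall()
print(len(rows), "orders joined across two databases")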

SQL Server 2017: HA and DR on Linux


SQL Server 2017: Adaptive Query Processing


Microsoft also showcased some of the data protection features in Azure SQL Database that are now generally available. Azure SQL Database Threat Detection detects SQL Injection, potential SQL Injection vulnerabilities, and anomalous login monitoring. This can simply be turned on at the SQL Database level by enabling auditing and configuring notifications. The administrator is then notified when the threat detection engine detects any anomalous behavior.

Graph Database
One of the things I was happiest to see in SQL Server 2017 was the introduction of a graph database within the core database engine. Despite the name, relational databases struggle with managing relationships between data objects. The simplest example of this struggle is hierarchy management. In a classic relational structure, an organizational chart can be a challenge to model -- who does the CEO report to? With graph database support in SQL Server, the concept of nodes and edges is introduced. Nodes represent entities, edges represent relationships between any two given nodes, and both nodes and edges can be associated with data properties. SQL Server 2017 also adds extensions to the T-SQL language to support join-less queries that use matching to return related values.
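As a rough illustration of the node/edge model and the MATCH extension (assuming a SQL Server 2017 instance and pyodbc; the tables, columns, and connection details are illustrative):

# Illustrative only: node and edge tables plus a MATCH query in place of
# explicit join predicates.
import pyodbc

conn = pyodbc.connect("DRIVER={ODBC Driver 17 for SQL Server};SERVER=localhost;"
                      "DATABASE=DemoDB;Trusted_Connection=yes", autocommit=True)

conn.execute("CREATE TABLE dbo.Person (Id INT PRIMARY KEY, Name NVARCHAR(100)) AS NODE;")
conn.execute("CREATE TABLE dbo.ReportsTo AS EDGE;")

# "Who does this employee report to?" becomes a graph traversal.
query = """
SELECT boss.Name
FROM   dbo.Person AS emp, dbo.ReportsTo, dbo.Person AS boss
WHERE  MATCH(emp-(ReportsTo)->boss)
  AND  emp.Name = N'Jane Doe';
"""
for row in conn.execute(query):
    print(row.Name)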

SQL Server 2017: Building applications using graph data
Graph extensions in SQL Server 2017 will facilitate users in linking different pieces of connected data to help gather powerful insights and increase operational agility. Graphs are well suited for applications where relationships are important, such as fraud detection, risk management, social networks, recommendation engines, predictive analysis, dependence analysis, and IoT applications. In this session we will demonstrate how you can use SQL Graph extensions to build your application using graph data. Download SQL Server 2017: Now on Windows, Linux, and Docker https://www.microsoft.com/en-us/sql-server/sql-server-vnext-including-Linux



Graph databases are especially useful in Internet of Things (IoT), social network, recommendation engine, and predictive analytics applications. It should be noted that many vendors have been investing in graph solutions; besides Microsoft, IBM and SAP have also released graph database features in recent years.

Adaptive Query Plans
One of the biggest challenges for a DBA is managing system performance over time. As data changes, the query optimizer generates new execution plans, which at times may be less than optimal. With Adaptive Query Optimization in SQL Server 2017, SQL Server can evaluate the runtime of a query and compare the current execution to the query's history, building on some of the technology introduced in the Query Store feature in SQL Server 2016. For the next run of the same query, Adaptive Query Optimization can then improve the execution plan.

Because a change to an execution plan that is based on one slow execution can have a dramatically damaging effect on system performance, the changes made by Adaptive Query Optimization are incremental and conservative. Over time, this feature handles the tuning a busy DBA may not have time to perform. This feature also benefits from Microsoft's management of Azure SQL Database, because the development team monitors the execution data and the improvements that adaptive execution plans make in the cloud. They can then optimize the process and flow for adaptive execution plans in future versions of the on-premises product.
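In SQL Server 2017 these adaptive behaviors are enabled under database compatibility level 140, so opting a database in (or back out) is a small change. A minimal sketch, assuming pyodbc and an illustrative database named DemoDB:

# Illustrative only: raise the compatibility level so the SQL Server 2017
# optimizer behaviors, including adaptive query processing, apply.
import pyodbc

conn = pyodbc.connect("DRIVER={ODBC Driver 17 for SQL Server};SERVER=localhost;"
                      "DATABASE=master;Trusted_Connection=yes", autocommit=True)

conn.execute("ALTER DATABASE DemoDB SET COMPATIBILITY_LEVEL = 140;")

level = conn.execute(
    "SELECT compatibility_level FROM sys.databases WHERE name = 'DemoDB';"
).fetchone()[0]
print("DemoDB compatibility level:", level)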

Are You a Business Intelligence Pro?
SQL Server includes much more than the database engine. Tools like Reporting Services (SSRS) and Analysis Services (SSAS) have long been a core part of the value proposition of SQL Server. Reporting Services benefited from a big overhaul in SQL Server 2016, and more improvements are coming in SQL Server 2017 with on-premises support for storing Power BI reports in an SSRS instance. This capability is big news for organizations that are cloud-averse for various reasons. In addition, SQL Server 2017 adds support for Power Query data sources in SSAS tabular models. This capability means tabular models can store data from a broader range of data sources than they currently support, such as Azure Blob Storage and web page data.

2017 OWASP SanFran March Meetup - Hacking SQL Server on Scale with PowerShell


And More...
Although it is only an incremental release, Microsoft has packed a lot of functionality into SQL Server 2017. I barely mentioned Linux in this article for a reason: from a database perspective, SQL Server on Linux is simply SQL Server. Certainly, there are some changes in infrastructure, but your development experience in SQL Server, whether on Linux, Windows, or Docker, is exactly the same.

Keep your environment always on with SQL Server 2016 - SQLBits 2017


From my perspective, the exciting news is not just the new features that are in this version, but also the groundwork for feature enhancements down the road. Adaptive query optimization will get better over time, as will the graph database feature which you can query by using standard SQL syntax. Furthermore, the enhancements to Azure SQL Database with managed instances should allow more organizations to consider adoption of the database as a service option. In general, I am impressed with Microsoft's ability to push the envelope on database technology so shortly after releasing SQL Server 2016.

Nordic infrastructure Conference 2017 - SQL Server on Linux Overview



You can get started with the CTP by downloading the Docker image (https://hub.docker.com/r/microsoft/mssql-server-windows/), the Linux packages (https://docs.microsoft.com/en-us/sql/linux/sql-server-linux-setup-red-hat), or the Windows release (https://www.microsoft.com/evalcenter/evaluate-sql-server-vnext-ctp).

More Information:

https://www.microsoft.com/en-us/sql-server/sql-server-2017

https://rcpmag.com/articles/2011/02/01/the-2011-microsoft-product-roadmap.aspx

https://adtmag.com/articles/2017/04/19/sql-server-2017.aspx

https://blogs.technet.microsoft.com/dataplatforminsider/2017/04/19/sql-server-2017-community-technology-preview-2-0-now-available/

http://info.microsoft.com/rs/157-GQE-382/images/EN-CNTNT-SQL_Server_on_Linux_Public_Preview_Technical_Whitepaper-en-us.pdf

https://info.microsoft.com/SQL-Server-on-Linux-Open-source-enterprise-environment.html


https://info.microsoft.com/CO-SQL-CNTNT-FY16-09Sep-14-MQOperational-Register.html?ls=website

https://blogs.technet.microsoft.com/dataplatforminsider/2017/04/20/graph-data-processing-with-sql-server-2017/

https://blogs.technet.microsoft.com/dataplatforminsider/2017/04/20/resumable-online-index-rebuild-is-in-public-preview-for-sql-server-2017-ctp-2-0/

https://blogs.technet.microsoft.com/dataplatforminsider/2017/04/19/python-in-sql-server-2017-enhanced-in-database-machine-learning/


KVM (Kernel Virtual Machine) or Xen? Choosing a Virtualization Platform


KVM versus Xen: which should you choose?

KVM (Kernel Virtual Machine)

KVM (Kernel-based Virtual Machine) is a full virtualization solution for Linux on x86 hardware containing virtualization extensions (Intel VT or AMD-V). It consists of a loadable kernel module, kvm.ko, that provides the core virtualization infrastructure, and a processor-specific module, kvm-intel.ko or kvm-amd.ko.

Virtualization Architecture & KVM




Using KVM, one can run multiple virtual machines running unmodified Linux or Windows images. Each virtual machine has private virtualized hardware: a network card, disk, graphics adapter, etc.
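To give a concrete feel for driving KVM from code, here is a minimal sketch using the libvirt Python bindings (the libvirt-python package), assuming libvirtd is running locally and the user is allowed to open qemu:///system:

# Illustrative only: connect to the local QEMU/KVM driver and list the
# defined domains (virtual machines) with their state and memory.
import libvirt

conn = libvirt.open("qemu:///system")
try:
    for dom in conn.listAllDomains():
        state, _reason = dom.state()
        running = (state == libvirt.VIR_DOMAIN_RUNNING)
        print(dom.name(),
              "running" if running else "not running",
              dom.maxMemory() // 1024, "MiB")
finally:
    conn.close()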

Virtualization Platform Smackdown: VMware vs. Microsoft vs. Red Hat vs. Citrix



KVM is open source software. The kernel component of KVM is included in mainline Linux, as of 2.6.20. The userspace component of KVM is included in mainline QEMU, as of 1.3.

Blogs from people active in KVM-related virtualization development are syndicated at http://planet.virt-tools.org/


KVM-Features

This is a possibly incomplete list of KVM features.

Hypervisors and Virtualization - VMware, Hyper-V, XenServer, and KVM




Feature highlights include:

  • QMP - Qemu Monitor Protocol
  • KSM - Kernel Samepage Merging
  • Kvm Paravirtual Clock - A Paravirtual timesource for KVM
  • CPU Hotplug support - Adding cpus on the fly
  • PCI Hotplug support - Adding pci devices on the fly
  • vmchannel - Communication channel between the host and guests
  • migration - Migrating Virtual Machines
  • vhost
  • SCSI disk emulation
  • Virtio devices
  • CPU clustering
  • HPET
  • Device assignment
  • PXE boot
  • iSCSI boot
  • x2APIC
  • Floppy
  • CD-ROM
  • USB
  • USB host device passthrough
  • Sound
  • Userspace irqchip emulation
  • Userspace PIT emulation
  • Balloon memory driver
  • Large pages support
  • Stable Guest ABI

Xen Hypervisor



The Xen hypervisor was first created by Keir Fraser and Ian Pratt as part of the Xenoserver research project at Cambridge University in the late 1990s. A hypervisor "forms the core of each Xenoserver node, providing the resource management, accounting and auditing that we require." The earliest web page dedicated to the Xen hypervisor is still available on Cambridge web servers.  The early Xen history can easily be traced through a variety of academic papers from Cambridge University. Controlling the XenoServer Open Platform is an excellent place to begin in understanding the origins of the Xen hypervisor and the XenoServer project. Other relevant research papers can be found at:



  • Xen and the Art of Virtualization - Paul Barham, Boris Dragovic, Keir Fraser, Steven Hand, Tim Harris, Alex Ho, Rolf Neugebauer, Ian Pratt, Andrew Warfield. Published at SOSP 2003
  • Xen and the Art of Repeated Research - Bryan Clark, Todd Deshane, Eli Dow, Stephen Evanchik, Matthew Finlayson, Jason Herne, Jenna Neefe Matthews. Clarkson University. Presented at FREENIX 2004





  • Safe Hardware Access with the Xen Virtual Machine Monitor - Keir Fraser, Steven Hand, Rolf Neugebauer, Ian Pratt, Andrew Warfield, Mark Williamson. Published at OASIS ASPLOS 2004 Workshop
  • Live Migration of Virtual Machines - Christopher Clark, Keir Fraser, Steven Hand, Jacob Gorm Hansen, Eric Jul, Christian Limpach, Ian Pratt, Andrew Warfield. Published at NSDI 2005
  • Ottawa Linux Symposium 2004 Presentation
  • Linux World 2005 Virtualization BOF Presentation - Overview of Xen 2.0, Live Migration, and Xen 3.0 Roadmap
  • Xen Summit 3.0 Status Report - Cambridge 2005
  • Introduction to the Xen Virtual Machine - Rami Rosen, Linux Journal. Sept 1, 2005
  • Virtualization in Xen 3.0 - Rami Rosen, Linux Journal. March 2, 2006
  • Xen and the new processors - Rami Rosen, Lwn.net. May 2, 2006

Over the years, the Xen community has hosted several Xen Summit events where the global development community meets to discuss all things Xen. Many presentations and videos of those events are available here.

Why Xen Project?

The Xen Project team is a global open source community that develops the Xen Project Hypervisor and its associated subprojects. Xen (pronounced /'zɛn/) Project takes its name from the ancient Greek term xenos (ξένος), which can be used to refer to guest-friends whose relationship is constructed under the ritual of xenia ("guest-friendship"); this is in turn a wordplay on the idea of guest operating systems as well as a community of developers and users. The original website was created in 2003 to allow a global community of developers to contribute to and improve the hypervisor. Click on the link to find out more about the project's interesting history.

Virtualization and Hypervisors




The community supporting the project follows a number of principles: Openness, Transparency, Meritocracy and Consensus Decision Making. Find out more about how the community governs itself.

What Differentiates the Xen Project Software?

Xen and the art of embedded virtualization (ELC 2017)



There are several virtualization technologies available in the world today. Our Xen Project virtualization and cloud software includes many powerful features which make it an excellent choice for many organizations:

Supports multiple guest operating systems: Linux, Windows, NetBSD, FreeBSD. A virtualization technology which only supports a few guest operating systems essentially locks the organization into those choices for years to come. With our hypervisor, you have the flexibility to use what you need and add other operating system platforms as your needs dictate. You are in control.

VMware Alternative: Using Xen Server for Virtualization


Supports multiple cloud platforms: CloudStack, OpenStack. A virtualization technology which only supports one cloud technology locks you into that technology. With the world of the cloud moving so quickly, it could be a mistake to commit to one cloud platform too soon. Our software keeps your choices open as cloud solutions continue to improve and mature.
Reliable technology with a solid track record: The hypervisor has been in production for many years and is the #1 open source hypervisor according to analysts such as Gartner. Conservative estimates show that Xen has an active user base of 10+ million: these are users, not merely hypervisor installations, which are an order of magnitude higher. Amazon Web Services alone runs ½ million virtualized Xen Project instances according to a recent study, and other cloud providers such as Rackspace and hosting companies use the hypervisor at extremely large scale. Companies such as Google and Yahoo use the hypervisor at scale for their internal infrastructure. Our software is the basis of successful commercial products such as Citrix XenServer and Oracle VM, which support an ecosystem of more than 2,000 commercially certified partners today. It is clear that many major industry players regard our software as a safe virtualization platform for even the largest clouds.

Scalability: The hypervisor can scale up to 4,095 host CPUs with 16 TB of RAM. Using paravirtualization (PV), the hypervisor supports a maximum of 512 vCPUs with 512 GB of RAM per guest. Using hardware virtualization (HVM), it supports a maximum of 128 vCPUs with 1 TB of RAM per guest.

Performance: Xen tends to outperform other open source virtualization solutions in most configurations. Check out Ubuntu 15.10: KVM vs. Xen vs. VirtualBox Virtualization Performance (Phoronix, Oct 2015) for recent benchmarks of Xen 4.6.

High-Performance Virtualization for HPC Cloud on Xen - Jun Nakajima & Tianyu Lan, Intel Corp.



Security: Security is one of the major concerns when moving critical services to virtualization or cloud computing environments. The hypervisor provides a high level of security due to its modular architecture, which separates the hypervisor from the control and guest operating systems. The hypervisor itself is thin and thus provides a minimal attack surface. The software also contains the Xen Security Modules (XSM), which have been developed and contributed to the project by the NSA for ultra-secure use cases. XSM introduces a control policy providing fine-grained controls over domains and their interaction amongst themselves and the outside world. And, of course, it is also possible to use the hypervisor with SELinux. In addition, Xen's Virtual Machine Introspection (VMI) subsystems make it the best hypervisor for security applications. For more information, see Virtual Machine Introspection with Xen and VM Introspection: Practical Applications.

Live patching the Xen Project hypervisor




The Xen Project also has a dedicated security team, which handles security vulnerabilities in accordance with our Security Policy. Unlike almost all corporations and even most open source projects, the Xen Project properly discloses, via an advisory, every vulnerability discovered in supported configurations. We also often publish advisories about vulnerabilities in other relevant projects, such as Linux and QEMU.

Flexibility: Our hypervisor is the most flexible hypervisor on the market, enabling you to tailor your installation to your needs. There are lots of choices and trade-offs that you can make. For example, the hypervisor works on older hardware using paravirtualization, and on newer hardware using HVM or PV on HVM. Users can choose from three tool stacks (XL, XAPI & LIBVIRT) and from an ecosystem of software complementing the project, and can pick the most suitable flavour of Linux and Unix operating system for their needs. Further, the project's flexible architecture enables vendors to create Xen-based products and services for servers, cloud, and desktop, in particular for ultra-secure environments.

Modularity: Our architecture is uniquely modular, enabling a degree of scalability, robustness, and security suitable even for large, critical, and extremely secure environments. The control functionality in our control domain can be divided into small modular domains running a minimal kernel and a driver, control logic or other functionality: we call this approach Domain Disaggregation. Disaggregated domains are conceptually similar to processes in an operating system. They can be started and stopped on demand, without affecting the rest of the system. Disaggregated domains reduce attack surface and distribute bottlenecks; for example, they enable you to restart an unresponsive device driver without affecting your VMs.

Analysis of the Xen code review process: An example of software development analytics



VM migration: The software supports virtual machine migration. This allows you to react to changing loads on your servers, protecting your workloads.
Open source: Open source means that you have influence over the direction of the code. You are not at the mercy of some immovable external organization which may have priorities that do not align with your organization's. You can participate and help ensure that your needs are heard in the process. And you never have to worry that some entity has decided to terminate the product for business reasons. An open source project will live as long as there are parties interested in advancing the software.

Multi-vendor support: The project enjoys support from a number of major software and service vendors. This gives end users numerous places to find support, as well as numerous service providers to work with. With such a rich commercial ecosystem around the project, there is plenty of interest in keeping the project moving forward to ever greater heights.

KVM or Xen? Choosing a Virtualization Platform

When Xen was first released in 2002, the GPL'd hypervisor looked likely to take the crown as the virtualization platform for Linux. Fast forward to 2010, and the new kid in town has displaced Xen as the virtualization platform of choice for Red Hat and lives in the mainline Linux kernel. Which one to choose? Read on for our look at the state of Xen vs. KVM.

Things in virtualization land move pretty fast. If you don't have time to keep up with the developments in KVM or Xen development, it's a bit confusing to decide which one (if either) you ought to choose. This is a quick look at the state of the market between Xen and KVM.

KVM and Xen

Xen is a hypervisor that supports x86, x86_64, Itanium, and ARM architectures, and can run Linux, Windows, Solaris, and some of the BSDs as guests on their supported CPU architectures. It's supported by a number of companies, primarily by Citrix, but also used by Oracle for Oracle VM, and by others. Xen can do full virtualization on systems that support virtualization extensions, but can also work as a hypervisor on machines that don't have the virtualization extensions.

KVM is a hypervisor that is in the mainline Linux kernel. Your host OS has to be Linux, obviously, but it supports Linux, Windows, Solaris, and BSD guests. It runs on x86 and x86-64 systems with hardware supporting virtualization extensions. This means that KVM isn't an option on older CPUs made before the virtualization extensions were developed, and it rules out newer CPUs (like Intel's Atom CPUs) that don't include virtualization extensions. For the most part, that isn't a problem for data centers that tend to replace hardware every few years anyway — but it means that KVM isn't an option on some of the niche systems like the SM10000 that are trying to utilize Atom CPUs in the data center.

If you want to run a Xen host, you need to have a supported kernel. Linux doesn't come with Xen host support out of the box, though Linux has been shipping with support to run natively as a Xen guest since the 2.6.23 kernel. What this means is that you can't just use a stock Linux distro to host Xen guests. Instead, you need to choose a Linux distro that ships with Xen support, build a custom kernel, or go with one of the commercial solutions based on Xen, like Citrix XenServer. The problem is that those solutions are not entirely open source.

And many do build custom kernels, or look to their vendors to do so. Xen is running on quite a lot of servers, from low-cost Virtual Private Server (VPS) providers like Linode to big boys like Amazon with EC2. A TechTarget article demonstrates how providers that have invested heavily in Xen are not likely to switch lightly. Even if KVM surpasses Xen technically, they're not likely to rip and replace the existing solutions in order to take advantage of a slight technical advantage.

And KVM doesn't yet have the technical advantage anyway. Because Xen has been around a bit longer, it has also had more time to mature than KVM. You'll find some features in Xen that haven't yet appeared in KVM, though the KVM project has a lengthy TODO list that it is concentrating on. (The list isn't a direct match for parity with Xen, just a good idea of what the KVM folks are planning to work on.) KVM does have a slight advantage in the Linux camp of being the anointed mainline hypervisor. If you're getting a recent Linux kernel, you've already got KVM built in. Red Hat Enterprise Linux 5.4 included KVM support, and the company is dropping Xen support in favor of KVM in RHEL 6.

This is, in part, an endorsement of how far KVM has come technically. Not only does Red Hat have the benefit of employing much of the talent behind KVM, there's the benefit of introducing friction for companies that have cloned Red Hat Enterprise Linux and invested heavily in Xen. By dropping Xen from the roadmap, they're forcing other companies to either drop Xen or pick up maintenance of Xen themselves and diverge from RHEL. This means extra engineering costs, more effort for ISV certifications, and so on.

KVM isn't entirely on par with Xen, though it's catching up quickly. It has matured enough that many organizations feel comfortable deploying it in production. So does that mean Xen is on the way out? Not so fast.

There Can Be Only One?

The choice of KVM vs. Xen is as likely to be dictated by your vendors as anything else. If you're going with RHEL over the long haul, bank on KVM. If you're running on Amazon's EC2, you're already using Xen, and so on. The major Linux vendors seem to be standardizing on KVM, but there's plenty of commercial support out there for Xen. Citrix probably isn't going away anytime soon.

It's tempting in the IT industry to look at technology as a zero sum game where one solution wins and another loses. The truth is that Xen and KVM are going to co-exist for years to come. The market is big enough to support multiple solutions, and there's enough backing behind both technologies to ensure that they do well for years to come.

Containers vs. Virtualization: The new Cold War?


More Information:

https://sites.google.com/site/virtualizationtestingframework/

http://www.serverwatch.com/server-trends/slideshows/top-10-virtualization-technology-companies-for-2016.html

https://xenproject.org/users/why-the-xen-project.html

http://planet.virt-tools.org/

https://www.linux-kvm.org/page/KVM_Features

https://lwn.net/Articles/705160/

https://xenproject.org/about/history.html

https://www.linux-kvm.org/page/Guest_Support_Status

https://www.linux-kvm.org/page/Management_Tools

https://www.linux-kvm.org/page/HOWTO

https://xenserver.org/overview-xenserver-open-source-virtualization/open-source-virtualization-features.html

https://blog.xenproject.org/category/releases/

https://wiki.xenproject.org/wiki/Xen_Project_Release_Features

https://wiki.xen.org/wiki/Xen_Project_4.4_Feature_List

http://www.brendangregg.com/blog/2014-05-09/xen-feature-detection.html

https://onapp.com/2016/09/06/hypervisor-choice-xen-or-kvm/

https://www.suse.com/documentation/sles-12/singlehtml/book_virt/book_virt.html

https://www.linux.com/blogs/linux-foundation





Red Hat OpenShift and Orchestrating Containers With KUBERNETES!


OVERVIEW

Kubernetes is a tool for orchestrating and managing Docker containers. Red Hat provides several ways you can use Kubernetes that include:

  • OpenShift Container Platform: Kubernetes is built into OpenShift, allowing you to configure Kubernetes, assign host computers as Kubernetes nodes, deploy containers to those nodes in pods, and manage containers across multiple systems. The OpenShift Container Platform web console provides a browser-based interface to using Kubernetes.
  • Container Development Kit (CDK): The CDK provides Vagrantfiles to launch the CDK with either OpenShift (which includes Kubernetes) or a bare-bones Kubernetes configuration. This gives you the choice of using the OpenShift tools or Kubernetes commands (such as kubectl) to manage Kubernetes.
  • Kubernetes in Red Hat Enterprise Linux: To try out Kubernetes on a standard Red Hat Enterprise Linux server system, you can install a combination of RPM packages and container images to manually set up your own Kubernetes configuration.

Resilient microservices with Kubernetes - Mete Atamel


Kubernetes, or k8s (k, 8 characters, s...get it?), or “kube” if you’re into brevity, is an open source platform that automates Linux container operations. It eliminates many of the manual processes involved in deploying and scaling containerized applications. In other words, you can cluster together groups of hosts running Linux containers, and Kubernetes helps you easily and efficiently manage those clusters. These clusters can span hosts across public, private, or hybrid clouds.

The Illustrated Children's Guide to Kubernetes


Kubernetes was originally developed and designed by engineers at Google. Google was one of the early contributors to Linux container technology and has talked publicly about how everything at Google runs in containers. (This is the technology behind Google’s cloud services.) Google generates more than 2 billion container deployments a week—all powered by an internal platform: Borg. Borg was the predecessor to Kubernetes and the lessons learned from developing Borg over the years became the primary influence behind much of the Kubernetes technology.

Fun fact: The seven spokes in the Kubernetes logo refer to the project’s original name, “Project Seven of Nine.”

Kubernetes & Container Engine


Red Hat was one of the first companies to work with Google on Kubernetes, even prior to launch, and has become the second-leading contributor to the Kubernetes upstream project. Google donated the Kubernetes project to the newly formed Cloud Native Computing Foundation in 2015.

An Introduction to Kubernetes


Why do you need Kubernetes?

Real production apps span multiple containers. Those containers must be deployed across multiple server hosts. Kubernetes gives you the orchestration and management capabilities required to deploy containers, at scale, for these workloads. Kubernetes orchestration allows you to build application services that span multiple containers, schedule those containers across a cluster, scale those containers, and manage the health of those containers over time.


Kubernetes also needs to integrate with networking, storage, security, telemetry and other services to provide a comprehensive container infrastructure.

Of course, this depends on how you’re using containers in your environment. A rudimentary application of Linux containers treats them as efficient, fast virtual machines. Once you scale this to a production environment and multiple applications, it's clear that you need multiple, colocated containers working together to deliver the individual services. This significantly multiplies the number of containers in your environment and as those containers accumulate, the complexity also grows.

Hands on Kubernetes 


Kubernetes fixes a lot of common problems with container proliferation by sorting containers together into "pods." Pods add a layer of abstraction to grouped containers, which helps you schedule workloads and provide necessary services—like networking and storage—to those containers. Other parts of Kubernetes help you load balance across these pods and ensure you have the right number of containers running to support your workloads.
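As a small illustration of that model, the sketch below asks Kubernetes for three replicas of a containerized web server through the official Python client (the kubernetes package); the image and names are illustrative, and a working kubeconfig is assumed:

# Illustrative only: declare a Deployment with three replicas; Kubernetes
# schedules the pods and keeps that number running.
from kubernetes import client, config

config.load_kube_config()          # reuse kubectl's kubeconfig credentials
apps = client.AppsV1Api()

deployment = client.V1Deployment(
    api_version="apps/v1",
    kind="Deployment",
    metadata=client.V1ObjectMeta(name="hello-web"),
    spec=client.V1DeploymentSpec(
        replicas=3,
        selector=client.V1LabelSelector(match_labels={"app": "hello-web"}),
        template=client.V1PodTemplateSpec(
            metadata=client.V1ObjectMeta(labels={"app": "hello-web"}),
            spec=client.V1PodSpec(containers=[
                client.V1Container(
                    name="web",
                    image="nginx:1.21",
                    ports=[client.V1ContainerPort(container_port=80)]),
            ]),
        ),
    ),
)
apps.create_namespaced_deployment(namespace="default", body=deployment)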

With the right implementation of Kubernetes—and with the help of other open source projects like Atomic Registry, Open vSwitch, heapster, OAuth, and SELinux— you can orchestrate all parts of your container infrastructure.

What can you do with Kubernetes?

The primary advantage of using Kubernetes in your environment is that it gives you the platform to schedule and run containers on clusters of physical or virtual machines. More broadly, it helps you fully implement and rely on a container-based infrastructure in production environments. And because Kubernetes is all about automation of operational tasks, you can do many of the same things that other application platforms or management systems let you do, but for your containers.

Red Hat is Driving Kubernetes/Container Security Forward - Clayton Coleman


Kubernetes’ features provide everything you need to deploy containerized applications. Here are the highlights:


  • Container Deployments & Rollout Control. Describe your containers and how many you want with a “Deployment.” Kubernetes will keep those containers running and handle deploying changes (such as updating the image or changing environment variables) with a “rollout.” You can pause, resume, and rollback changes as you like.
  • Resource Bin Packing. You can declare minimum and maximum compute resources (CPU & Memory) for your containers. Kubernetes will slot your containers in wherever they fit. This increases your compute efficiency and ultimately lowers costs.
  • Built-in Service Discovery & Autoscaling. Kubernetes can automatically expose your containers to the internet or other containers in the cluster. It automatically load-balances traffic across matching containers. Kubernetes supports service discovery via environment variables and DNS, out of the box. You can also configure CPU-based autoscaling for containers for increased resource utilization.
  • Heterogeneous Clusters. Kubernetes runs anywhere. You can build your Kubernetes cluster from a mix of virtual machines (VMs) running in the cloud or on-prem, and bare metal in your datacenter. Simply choose the composition according to your requirements.
  • Persistent Storage. Kubernetes includes support for persistent storage connected to stateless application containers. There is support for Amazon Web Services EBS, Google Cloud Platform persistent disks, and many, many more.
  • High Availability Features. Kubernetes is planet scale. This requires special attention to high availability features such as multi-master or cluster federation. Cluster federation allows linking clusters together so that if one cluster goes down containers can automatically move to another cluster.

These key features make Kubernetes well suited for running different application architectures from monolithic web applications, to highly distributed microservice applications, and even batch driven applications.

With Kubernetes you can:

  • Orchestrate containers across multiple hosts.
  • Make better use of hardware to maximize resources needed to run your enterprise apps.
  • Control and automate application deployments and updates.
  • Mount and add storage to run stateful apps.
  • Scale containerized applications and their resources on the fly.
  • Declaratively manage services, which guarantees the deployed applications are always running how you deployed them.
  • Health-check and self-heal your apps with autoplacement, autorestart, autoreplication, and autoscaling.

Kubernetes, however, relies on other projects to fully provide these orchestrated services. With the addition of other open source projects, you can fully realize the power of Kubernetes. These necessary pieces include (among others):


  • Registry, through projects like Atomic Registry or Docker Registry.
  • Networking, through projects like OpenvSwitch and intelligent edge routing.
  • Telemetry, through projects such as heapster, kibana, hawkular, and elastic.
  • Security, through projects like LDAP, SELinux, RBAC, and OAUTH with multi-tenancy layers.
  • Automation, with the addition of Ansible playbooks for installation and cluster life-cycle management.
  • Services, through a rich catalog of precreated content of popular app patterns.
  • Get all of this, prebuilt and ready to deploy, with Red Hat OpenShift

Container Management with OpenShift Red Hat - Open Cloud Day 2016



Learn to speak Kubernetes

Like any technology, there are a lot of words specific to the technology that can be a barrier to entry. Let's break down some of the more common terms to help you understand Kubernetes.

Master: The machine that controls Kubernetes nodes. This is where all task assignments originate.

Node: These machines perform the requested, assigned tasks. The Kubernetes master controls them.

Pod: A group of one or more containers deployed to a single node. All containers in a pod share an IP address, IPC, hostname, and other resources. Pods abstract network and storage away from the underlying container. This lets you move containers around the cluster more easily.

Replication controller:  This controls how many identical copies of a pod should be running somewhere on the cluster.

Service: This decouples work definitions from the pods. Kubernetes service proxies automatically get service requests to the right pod—no matter where it moves to in the cluster or even if it’s been replaced.

Kubelet: This service runs on nodes and reads the container manifests and ensures the defined containers are started and running.

kubectl: This is the command line configuration tool for Kubernetes.

Check out the Kubernetes Reference: https://kubernetes.io/docs/reference/
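To make the vocabulary concrete, here is a minimal sketch that walks the same objects through the official Kubernetes Python client, assuming a reachable cluster and a local kubeconfig:

# Illustrative only: the master schedules work, nodes run it, and pods are
# the unit of scheduling; the client can list all of them.
from kubernetes import client, config

config.load_kube_config()
core = client.CoreV1Api()

for node in core.list_node().items:
    print("node:", node.metadata.name)

for pod in core.list_pod_for_all_namespaces().items:
    print("pod:", pod.metadata.namespace, pod.metadata.name, pod.status.phase)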

Using Kubernetes in production

Kubernetes is open source. And, as such, there’s not a formalized support structure around that technology—at least not one you’d trust your business on. If you had an issue with your implementation of Kubernetes, while running in production, you’re not going to be very happy. And your customers probably won’t, either.

Performance and Scalability Tuning Kubernetes for OpenShift and Docker by Jeremy Eder, Red Hat


That’s where Red Hat OpenShift comes in. OpenShift is Kubernetes for the enterprise—and a lot more. OpenShift includes all of the extra pieces of technology that makes Kubernetes powerful and viable for the enterprise, including: registry, networking, telemetry, security, automation, and services. With OpenShift, your developers can make new containerized apps, host them, and deploy them in the cloud with the scalability, control, and orchestration that can turn a good idea into new business quickly and easily.

Best of all, OpenShift is supported and developed by the #1 leader in open source, Red Hat.


Kubernetes runs on top of an operating system (Red Hat Enterprise Linux Atomic Host, for example) and interacts with pods of containers running on the nodes. The Kubernetes master takes the commands from an administrator (or DevOps team) and relays those instructions to the subservient nodes. This handoff works with a multitude of services to automatically decide which node is best suited for the task. It then allocates resources and assigns the pods in that node to fulfill the requested work.

So, from an infrastructure point of view, there is little change to how you’ve been managing containers. Your control over those containers happens at a higher level, giving you better control without the need to micromanage each separate container or node. Some work is necessary, but it’s mostly a question of assigning a Kubernetes master, defining nodes, and defining pods.


What about Docker?
The Docker technology still does what it's meant to do. When Kubernetes schedules a pod to a node, the kubelet on that node will instruct Docker to launch the specified containers. The kubelet then continuously collects the status of those containers from Docker and aggregates that information in the master. Docker pulls containers onto that node and starts and stops those containers as normal. The difference is that an automated system asks Docker to do those things instead of the admin doing so by hand on all nodes for all containers.


OpenStack Compute for Containers
While many customers are already running containers on Red Hat Enterprise Linux 7 as an OpenStack guest operating system, we are also seeing greater interest in Red Hat Enterprise Linux Atomic Host as a container-optimized guest OS option. And while most customers run their containers in guest VMs driven by Nova, we are also seeing growing interest in customers who want to integrate with OpenStack Ironic to run containers on bare metal hosts. With OpenStack, customers can manage both virtual and physical compute infrastructure to serve as the foundation for their container application workloads.

Earlier this year we also demonstrated how OpenStack administrators could use Heat to deploy a cluster of Nova instances running Kubernetes. The Heat templates contributed by Red Hat simplify the provisioning of new container host clusters, which are ready to run container workloads orchestrated by Kubernetes. Heat templates also serve as the foundation for the OpenStack Magnum API to make container orchestration engines like Kubernetes available as first-class resources in OpenStack. We also recently created Heat templates to deploy OpenShift 3 and added them to the OpenStack Community App Catalog. Our next step is to make elastic provisioning and deprovisioning of Kubernetes nodes based on resource demand a reality.

Building Clustered Applications with Kubernetes and Docker - Stephen Watt, Red Hat


Linux
Linux is at the foundation of OpenStack and modern container infrastructures. While we are excited to see Microsoft invest in Docker to bring containers to Windows, they are still Linux containers after all. Red Hat's first major contribution was bringing containers to enterprise Linux and RPM-based distributions like Fedora, Red Hat Enterprise Linux and CentOS. Since then we launched Project Atomic and made available Red Hat Enterprise Linux Atomic Host as a lightweight, container-optimized, immutable Linux platform for enterprise customers. With the recent surge in new container-optimized Linux distributions being announced, we see this as more than just a short-term trend. This year we plan to release Red Hat Enterprise Linux Atomic Host 7.2 and talk about how customers are using it as the foundation for containerized application workloads.

Red Hat Container Strategy


Docker
Docker has defined the packaging format and runtime for containers, which has now become the de facto standard for the industry, as embodied in OCI and the runC reference implementation. Red Hat continues to contribute extensively to the Docker project and is now helping to drive governance of OCI and the implementation of runC. We are committed to making Docker more secure, both in the container runtime and its content, and to working with our partners to enable customers to safely containerize their most mission-critical applications.

Architecture Overview: Kubernetes with Red Hat Enterprise Linux 7.1


Kubernetes
Kubernetes is Red Hat's choice for container orchestration and management, and it is also seeing significant growth, with more than 500 contributors and nearly 20,000 commits to the Kubernetes project in just over a year. While there is a lot of innovation in the container orchestration space, we see Kubernetes as another emerging standard given the combination of Google's experience running container workloads at massive scale, Red Hat's contributions and experience making open source work in enterprise environments, and the growing community surrounding it.

Microservices with Docker, Kubernetes, and Jenkins


This "LDK" stack is the foundation of Red Hat OpenShift 3 and the Atomic Enterprise Platform announced recently at Red Hat Summit. It's also the foundation of Google Container Engine, which is now generally available, and of other vendor and customer solutions that were featured recently at LinuxCon during the Kubernetes 1.0 launch.

Red Hat has helped drive innovation in this new container stack while also driving integration with OpenStack. We have focused our efforts on integrating with the three core pillars of OpenStack: compute, networking and storage. Here's how:

OpenStack Networking for Containers
Red Hat leverages the Kubernetes networking model to enable networking across multiple containers, running across multiple hosts. In Kubernetes, each container (or "pod") has its own IP address and can communicate with other containers/pods, regardless of which host they run on. Red Hat integrated RHEL Atomic Host with Flannel for container networking and also developed a new OVS-based SDN solution that is included in OpenShift 3 and the Atomic Enterprise Platform. But in OpenStack environments, users may want to leverage Neutron and its rich ecosystem of networking plugins to handle networking for containers. We've been working in both the OpenStack and Kubernetes communities to integrate Neutron with Kubernetes networking to enable this.

OpenShift Enterprise 3.1 vs kubernetes


OpenStack Storage for Containers
Red Hat also leverages Kubernetes storage volumes to enable users to run stateful services in containers like databases, message queues and other stateful apps. Users map their containers to persistent storage clusters, leveraging Kubernetes storage plugins like NFS, iSCSI, Gluster, Ceph, and more. The OpenStack Cinder storage plugin currently under development will enable users to map to storage volumes managed by OpenStack Cinder.
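As a small illustration of how an application claims such storage, the sketch below creates a PersistentVolumeClaim through the Kubernetes Python client; the size and names are illustrative, and which backend satisfies the claim depends on the cluster's storage classes:

# Illustrative only: a stateful workload requests storage by creating a
# PersistentVolumeClaim; the matching plugin (NFS, Gluster, Ceph, Cinder,
# and so on) is chosen by the cluster, not by the application.
from kubernetes import client, config

config.load_kube_config()
core = client.CoreV1Api()

pvc = client.V1PersistentVolumeClaim(
    api_version="v1",
    kind="PersistentVolumeClaim",
    metadata=client.V1ObjectMeta(name="db-data"),
    spec=client.V1PersistentVolumeClaimSpec(
        access_modes=["ReadWriteOnce"],
        resources=client.V1ResourceRequirements(requests={"storage": "10Gi"}),
    ),
)
core.create_namespaced_persistent_volume_claim(namespace="default", body=pvc)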

Linux, Docker, and Kubernetes form the core of Red Hat’s enterprise container infrastructure. This LDK stack integrates with OpenStack’s compute, storage and networking services to provide an infrastructure platform for running containers. In addition to these areas, there are others that we consider critical for enterprises who are building a container-based infrastructure. A few of these include:

  • Container Security – Red Hat is working with Docker and the Open Containers community on container security. Security is commonly cited as one of the leading concerns limiting container adoption, and Red Hat is tackling this on multiple levels. The first is multi-tenant isolation to help prevent containers from exploiting other containers or the underlying container host. Red Hat contributed SELinux integration to Docker to provide a layered security model for container isolation, and is also contributing to the development of features like privileged containers and user namespaces. The second area is securing container images to verify trusted content, which is another key concern. Red Hat has driven innovation in areas like image signing, scanning and certification, and we recently announced our work with Black Duck to help make application containers free from known vulnerabilities.
  • Enterprise Registry – Red Hat provides a standard Docker registry as a fully integrated component of both OpenShift and Atomic. This enables customers to more securely store and manage their own Docker images for enterprise deployments. Administrators can manage who has access to images, determine which images can be deployed and manage image updates.
  • Logging & Metrics – Red Hat has already integrated the ELK stack with Red Hat Enterprise Linux OpenStack Platform. It is doing the same in OpenShift and Atomic to provide users with aggregate logging for containers. This will enable administrators to get aggregated logs across the platform and also simplify log access for application developers. This work extends into integrated metrics for containerized applications and infrastructure.
  • Container Management – Red Hat CloudForms enables infrastructure and operations teams to manage application workloads across many different deployment fabrics – physical, virtual, public cloud and also private clouds based on OpenStack. CloudForms is being extended to manage container-based workloads in its next release. This will provide a single pane of glass to manage container-based workloads on OpenStack infrastructure.
Ultimately, the goal of containers is to provide a better way to package and deploy applications and to make application developers more productive. Containers provide many benefits to developers, such as portability, fast deployment times and a broad ecosystem of packaged container images for a wide array of software stacks. As applications become more componentized and highly distributed with the advent of microservices architectures, containers provide an efficient way to deploy these microservices without the overhead of traditional VMs.

Red Hat OpenShift Container Platform Overview


But to provide a robust application platform and enable DevOps and Continuous Delivery, we also need to solve other challenges. Red Hat is tackling many of these in OpenShift, which is a containerized application platform that natively integrates Docker and is built on Red Hat’s enterprise container stack. These challenges include:

Build Automation– Developers moving to containerize their applications will likely need to update their build tools and processes to build container images. Red Hat is working on automating the Docker image build process at scale and has developed innovations like OpenShift source-to-image which enables users to push code changes and patches to their application containers, without being concerned with the details of Dockerfiles or Docker images.
Deployment Automation and CI/CD– Developers will also need to determine how containers will impact their deployment workflows and integrate with their CI/CD systems. Red Hat is working on automating common application deployment patterns with containers, like rolling, canary and A/B deployments. We are also working to enable CI/CD with containers, with work underway in OpenShift upstream projects like Origin and Fabric8.
Containerized Middleware and Data Services– Administrators will need to provide their developers with trusted images to build their applications. Red Hat provides multiple language runtime images in OpenShift including Java, Node.js, Python, Ruby and more. We are also providing containerized middleware images like JBoss EAP, A-MQ and Fuse as well as database images from Red Hat’s Software Collections including MongoDB, Postgres and MySQL.
Developer Self Service – Ultimately, developers want to access all of these capabilities without having to call on IT. With OpenShift, developers can access self-service Web, CLI and IDE interfaces to build and deploy containerized applications. OpenShift’s developer- and application-centric view provides a great complement to OpenStack.

Containers Anywhere with OpenShift by Red Hat


This is just a sampling of the work we are doing in Containers and complements all the great work Red Hat contributes to in the OpenStack community. OpenStack and Containers are two examples of the tremendous innovation happening in open source and this week we are showcasing how they are great together.

More Information:

http://events.linuxfoundation.org/events/cloudnativecon-and-kubecon-north-america

https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux_atomic_host/7/html/getting_started_with_kubernetes/get_started_orchestrating_containers_with_kubernetes

https://www.redhat.com/en/containers/what-is-kubernetes

https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux_atomic_host/7/html-single/getting_started_with_kubernetes/

http://rhelblog.redhat.com/tag/kubernetes/

http://redhatstackblog.redhat.com/tag/kubernetes/

https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux_atomic_host/7/html/getting_started_with_kubernetes/

https://www.redhat.com/en/services/training/do180-introduction-containers-kubernetes-red-hat-openshift

https://blog.openshift.com/red-hat-chose-kubernetes-openshift/

https://www.openshift.com/container-platform/kubernetes.html

https://www.openshift.com

https://cloudacademy.com/blog/what-is-kubernetes/

https://keithtenzer.com/2015/04/15/containers-at-scale-with-kubernetes-on-openstack/





DB2 12 for z/OS – The #1 Enterprise Database


Some IBM DB2 12 highlights:


  • Improved business insight: highly concurrent queries run up to 100x faster.
  • Faster mobile support: 6 million transactions per minute via RESTful API.
  • Enterprise scalability, reliability and availability for IoT apps: 11.7 million inserts per second, 256 trillion rows per table.
  • Reduced cost: 23 percent lower CPU cost through advanced in-memory techniques.

DB2 12 Overview Nov UG 2016 Final




Links for the above video:
https://www-01.ibm.com/support/docview.wss?uid=swg27047206#db2z12new
http://www.mdug.org/Presentations/DB2%2012%20Overview%20Nov%20UG%202016%20FINAL.pdf


Strategy and Directions for the IBM® Mainframe


Machine Learning for z/OS



Temporal Tables, Transparent Archiving in DB2 for z/OS and IDAA

IBM Z Software z14 Announcement




Throughout the development of the all-new IBM z14, we have worked closely with dozens of clients around the world to understand what they need to accelerate their digital transformation, securely. What we learned was that data security is foundational to everything they do, that they are striving to leverage their data to gain a competitive edge, and that ultimately everyone is trying to move faster to compete at the speed of business.

Data is the New Security Perimeter with Pervasive Encryption



Job #1 is protecting their confidential data, and that of their clients, from both internal and external threats. The z14 introduces pervasive encryption as the new standard, with 100% of data encrypted at rest and in motion. It is uniquely able to bulk encrypt data in IBM Information Management System (IMS), IBM DB2 for z/OS, and Virtual Storage Access Method (VSAM) with no changes to applications and no impact on SLAs.
IBM MQ for z/OS already encrypts messages from end-to-end with its Advanced Message Security feature. On the new z14, MQ can scale to greater heights with the 7X boost in on-chip encryption performance compared to z13.

Additionally, with Secure Service Containers, z14 can help prevent data breaches by rogue administrators by restricting root access to the environment, which is administered only through graphical user interfaces. This is one of the many differentiating security features behind IBM’s Blockchain High Security Business Network delivered in the IBM Cloud.

DB2 12 Technical Overview PART 1


DB2 12 Technical Overview PART 2



Ever-evolving Intelligence with Machine Learning

Data is the world’s next great natural resource. Our clients are looking to gain a competitive edge with the vast amounts of data they have and turn insights into actions in real time when it matters.  IBM Machine Learning for z/OS can decrease the time businesses take to continuously build, train, and deploy intelligent behavioral models by keeping the data on IBM Z where it is secure.  They can also take advantage of IBM DB2 Analytics Accelerator for z/OS’s new Zero Latency technology, which uses a just-in-time protocol for data coherency for analytic requests to train and retrain their models on the fly.

IBM Z provides the agility to continuously deliver new function via microservices, APIs or more traditional applications.
Innovate with Microservices and leverage open source.

Microservices can be built on z14 with Node.js, Java, Go, Swift, Python, Scala, Groovy, Kotlin, Ruby, COBOL, PL/I, and more.  They can be deployed in Docker containers where a single z14 can scale out to 2 million Docker containers.  These services can run up to 5X faster when co-located with the data they need on IBM Z.  The data could be existing data on DB2 or IMS or it could be using open source technologies such as MariaDB, Cassandra, or MongoDB.  On z14, a single instance of MongoDB can hold 17 TB of data without sharding!

What's new from the optimizer in DB2 12 for z/OS?



Another DB2 LUW Version 11.1 highlight is the capability to deploy DB2 pureScale on IBM Power Systems with little endian Linux operating systems. This approach works with both vanilla Transmission Control Protocol/Internet Protocol (TCP/IP), aka sockets, and higher-speed Remote Direct Memory Access (RDMA) over Converged Ethernet (RoCE) networks. As expected, it provides all of DB2 pureScale’s availability advantages, including online member recovery and rolling updates, as well as DB2 pureScale’s very strong scalability attributes. Here is an example of the throughput scaling experienced in the lab for an OLTP workload running over both TCP/IP and RoCE:



What's new in the DB2 12 base release

DB2® 12 for z/OS® takes DB2 to a new level, both extending the core capabilities and empowering the future. DB2 12 extends the core with new enhancements to scalability, reliability, efficiency, security, and availability. DB2 12 also empowers the next wave of applications in the cloud, mobile, and analytics spaces.
This information might sometimes also refer to DB2 12 for z/OS as "DB2" or "Version 12."

DB2 12 for z/OS - Catch the wave early and stay ahead!


Continuous delivery and DB2 12 function levels

DB2 12 introduces continuous delivery of new capabilities and enhancements in a single service stream as soon as they are ready. The result is that you can benefit from new capabilities and enhancements without waiting for an entire new release. Function levels enable you to control the timing of the activation and adoption of new features, with the option to continue to apply corrective and preventative service without adopting new feature function.
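As a minimal sketch of how an application can see and control this behavior, the Python snippet below (connection string, credentials and subsystem name are placeholders; it assumes the ibm_db driver and DB2 connectivity) reads the application compatibility level currently in effect and then opts the connection in to DB2 12 function level 500 behavior:

# Minimal sketch: inspect and set the application compatibility level that
# governs which new DB2 12 SQL capabilities this connection can use.
# Connection details are placeholders; requires the ibm_db Python driver.
import ibm_db

conn = ibm_db.connect(
    "DATABASE=DSNL;HOSTNAME=zhost.example.com;PORT=446;PROTOCOL=TCPIP;"
    "UID=dbuser;PWD=secret", "", "")

stmt = ibm_db.exec_immediate(
    conn, "SELECT CURRENT APPLICATION COMPATIBILITY FROM SYSIBM.SYSDUMMY1")
print("APPLCOMPAT in effect:", ibm_db.fetch_tuple(stmt)[0])

# Ask for DB2 12 function level 500 behavior on this connection.
ibm_db.exec_immediate(conn,
                      "SET CURRENT APPLICATION COMPATIBILITY = 'V12R1M500'")
ibm_db.close(conn)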

New capabilities and enhancements in the DB2 12 base release
Most new capabilities in DB2 12 are introduced in DB2 12 function levels. However, some become available immediately in the base DB2 12 release, or when you apply maintenance.

Highlighted new capabilities in the DB2 12 base release

For information about new capabilities and enhancements in DB2 12 function levels, see What's new in DB2 12 function levels. The following sections describe new capabilities and enhancements introduced in the DB2 base (function levels 100 or 500) after general availability of DB2 12.

DevOps with DB2: Automated deployment of applications with IBM UrbanCode Deploy:
With UrbanCode Deploy, you can easily automate the deployment and configuration of database schema changes in DB2 11 and DB2 12. The automation reduces the time, costs, and complexity of deploying and configuring your business-critical apps, getting you to business value faster and more efficiently.

Modern language support DB2 for z/OS application development:
DB2 11 and DB2 12 now support application development in many modern programming and scripting languages. Application developers can use languages like Python, Perl, and Ruby on Rails to write DB2 for z/OS applications. Getting business value from your mainframe applications is now more accessible than ever before.
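For example, a small Python program can query DB2 for z/OS directly through the ibm_db driver; the host, credentials and the sample EMP table qualifier below are placeholders and will differ by installation:

# Minimal sketch: a Python application reading from DB2 for z/OS via ibm_db.
# Connection string, credentials and the sample table name are placeholders.
import ibm_db

conn = ibm_db.connect(
    "DATABASE=DSNL;HOSTNAME=zhost.example.com;PORT=446;PROTOCOL=TCPIP;"
    "UID=dbuser;PWD=secret", "", "")

stmt = ibm_db.exec_immediate(
    conn, "SELECT EMPNO, LASTNAME FROM DSN8C10.EMP FETCH FIRST 5 ROWS ONLY")
row = ibm_db.fetch_assoc(stmt)
while row:
    print(row["EMPNO"], row["LASTNAME"])
    row = ibm_db.fetch_assoc(stmt)
ibm_db.close(conn)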

DB2 REST services improve efficiency and security:
The DB2 REST service provider, available in DB2 11 and DB2 12, unleashes your enterprise data and applications on DB2 for z/OS for the API economy. Mobile and cloud app developers can efficiently create consumable, scalable, and RESTful services. Mobile and cloud app developers can consume these services to securely interact with business-critical data and transactions, without special DB2 for z/OS expertise.
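A DB2 native REST service is just an HTTPS endpoint, so any client stack can call it; the hedged Python sketch below posts input parameters to a hypothetical service (the host, port, service collection, service name, credentials and parameter are all made up for illustration) and prints the JSON result:

# Minimal sketch: invoke a DB2 for z/OS REST service over HTTPS with the
# 'requests' library. URL, credentials and parameters are illustrative only.
import requests

url = "https://zhost.example.com:4443/services/EMPINFO/getEmployee"
resp = requests.post(url,
                     json={"empno": "000010"},
                     auth=("dbuser", "secret"))  # basic auth; use certificates/RACF as appropriate
resp.raise_for_status()
print(resp.json())                                # result set returned as JSON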

Overview of DB2 12 new function availability

The availability of new function depends on the type of enhancement, the activated function level, and the application compatibility levels of applications. In the initial DB2 12 release, most new capabilities are enabled only after the activation of function level 500 or higher.

Virtual storage enhancements
Virtual storage enhancements become available at the activation of the function level that introduces them or higher. Activation of function level 100 introduces all virtual storage enhancements in the initial DB2 12 release. That is, activation of function level 500 introduces no virtual storage enhancements.

Temporal Tables, Transparent Archiving in DB2 for z/OS and IDAA


Subsystem parameters
New subsystem parameter settings are in effect only when the function level that introduced them or a higher function level is activated. All subsystem parameter changes in the initial DB2 12 release take effect in function level 500. For a list of these changes, see Subsystem parameter changes in the DB2 12 base release.

Optimization enhancements
Optimization enhancements become available after the activation of the function level that introduces them or higher, and full prepare of the SQL statements. When a full prepare occurs depends on the statement type:

For static SQL statements, after bind or rebind of the package
For non-stabilized dynamic SQL statements, immediately, unless the statement is in the dynamic statement cache
For stabilized dynamic SQL statements, after invalidation, free, or changed application compatibility level

Activation of function level 100 introduces all optimization enhancements in the initial DB2 12 release. That is, function level 500 introduces no optimization enhancements.

SQL capabilities
New SQL capabilities become available after the activation of the function level that introduces them or higher, for applications that run at the equivalent application compatibility level or higher. New SQL capabilities in the initial DB2 12 release become available in function level 500 for applications that run at the equivalent application compatibility level or higher. You can continue to run SQL statements compatibly with lower function levels, or previous DB2 releases, including DB2 11 and DB2 10.

The demands of the mobile economy and the greater need for faster business insights, combined with the explosive growth of data, present unique opportunities and challenges for companies wanting to take advantage of their mission-critical resources. Built on the proven, trusted availability, security, and scalability of DB2 11 for z/OS and the z Systems platform, the gold standard in the industry, DB2 12 gives you the capabilities needed to securely meet the business demands of mobile workloads and increased mission-critical data. It delivers world-class analytics and OLTP performance in real time.

DB2 for z/OS delivers innovations in these key areas:

Scalable, low-cost, enterprise OLTP and analytics

DB2 12 continues to improve upon the value offered with DB2 11, with further CPU savings and performance improvements through greater memory optimization. Compared to DB2 11, DB2 12 clients can achieve up to 10% CPU savings for various traditional OLTP workloads; heavy concurrent INSERT workloads may see higher benefits, with up to 30% CPU savings, and select query workloads that use UNION ALL, large sorts, and selective user-defined functions (UDFs) may see even greater benefit.

DB2 12 provides further cost reduction through greater zIIP eligibility for the DB2 REORG and LOAD utilities.

DB2 12 provides deep integration with the IBM z13, offering the following benefits:

More efficient use of compression
Support for compression of LOB data (also available with the IBM zEnterprise EC12)
Faster XML parsing through the use of SIMD technology
Enhancements to compression aid DB2 utility processing by reducing elapsed time and CPU consumption, with the potential to improve data and application availability. Hardware exploitation to support compression of LOB data can significantly reduce storage requirements and improve the overall efficiency of LOB processing.

DB2 12 includes the new SQL TRANSFER OWNERSHIP statement, enabling better security and control of objects that contain sensitive data. In addition, DB2 12 enables system administrators to migrate and install DB2 systems while preventing access to user data.
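As a hedged sketch of the new statement (object names, the new owner and connection details are placeholders), ownership of a table holding sensitive data can be transferred while revoking the former owner's implicit privileges:

# Minimal sketch: run TRANSFER OWNERSHIP from a Python client via ibm_db.
# Table name, new owner and connection details are placeholders.
import ibm_db

conn = ibm_db.connect(
    "DATABASE=DSNL;HOSTNAME=zhost.example.com;PORT=446;PROTOCOL=TCPIP;"
    "UID=secadm;PWD=secret", "", "")
ibm_db.exec_immediate(
    conn,
    "TRANSFER OWNERSHIP OF TABLE HR.PAYROLL "
    "TO USER NEWOWNER REVOKE PRIVILEGES")
ibm_db.close(conn)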

The real-world proven, system-wide resiliency, availability, scalability, and security capabilities of DB2 and z Systems continue to be the industry standard, keeping your business running when other solutions may not. This is especially important as enterprises support dynamic mobile workloads and the explosion of data in their enterprises. DB2 12 continues to excel and extend the unique value of z Systems, while empowering the next wave of applications.

Easy access, easy scale, and easy application development for the mobile enterprise:

In-memory performance improvements

As enterprises manage the emergence of the next generation of mobile applications and the proliferation of the IoT, database management system (DBMS) performance can become a critical success factor. To that end, DB2 12 contains many features that exploit in-memory techniques to deliver world-class performance, including:

  • In-memory fast index traverse
  • Contiguous and larger buffer pools
  • Use of in-memory pipes for improved insert performance
  • Increased sort and hash in-memory to improve sort and join performance
  • Caching the result of UDFs
  • In-memory optimization in Declare Global Temporary Table (DGTT) to improve declare performance
  • In-memory optimization in the Resource Limit Facility to improve RLF checking

DB2 12 offers features to facilitate the successful deployment of new analytics and mobile workloads. Workloads connecting through the cloud or from a mobile device may not have the same performance characteristics as traditional enterprise workloads. To that end, DB2 12 has many features to help ensure that new application deployments are successful. Sort-intensive workloads and workloads that use outer joins, UNION ALL, and CASE expressions can experience improved performance and increased CPU parallelism offload to zIIP.

Easy access to your enterprise systems of record

DB2 12 can be used to connect RESTful web, mobile, and cloud applications to DB2 for z/OS, providing an environment for service creation, management, discovery, and invocation. This feature works with IBM z/OS Connect Enterprise Edition (z/OS Connect EE, 5655-CEE) and other RESTful providers to provide a solution for REST API definition and deployment.

The IBM Data Studio product, which can be used as the front-end tooling to create, deploy, or remove DB2 for z/OS services, is supported. Alternatively, new RESTful management services and BIND support are provided to manage services created in DB2 for z/OS. This capability was first made available in the DB2 Adapter for z/OS Connect feature of the DB2 Accessories Suite for z/OS, V3.3 (5697-Q04) product, working with both DB2 10 for z/OS and DB2 11 for z/OS.

Overview of DB2 features

DB2 12 for z/OS consists of the base DB2 product with a set of optional separately orderable features. Select features QMF Enterprise Edition V12 and QMF Classic Edition V12 are also made available as part of DB2 11 for z/OS (5615-DB2). Some of these features are available at no additional charge and others are chargeable:

Chargeable features for QMF V12 (features of DB2 12 for z/OS and DB2 11 for z/OS)

QMF Enterprise Edition provides a complete business analytics solution for enterprise-wide business information across end-user and database platforms. QMF Enterprise Edition consists of the following capabilities:

  • QMF for TSO and CICS
  • QMF Enhanced Editor (new)
  • QMF Analytics for TSO
  • QMF High Performance Option (HPO)
  • QMF for Workstation
  • QMF for WebSphere
  • QMF Data Service, including QMF Data Service Studio (new)
  • QMF Vision (new)

New enhancements for each capability are as follows:

QMF for TSO and CICS has significant improvements for the QMF for TSO/CICS client.

The QMF process of saving database tables, traditionally accomplished through the QMF SAVE DATA command, has been enhanced. QMF SAVE DATA intermediate results can now be saved to IBM DB2 Analytics Accelerator for z/OS 'Accelerator-only tables'. The ability to save intermediate results in Accelerator-only tables is also available for the command IMPORT TABLE and the new QMF RUN QUERY command with the TABLE keyword. This exploitation of the Accelerator may result in benefits such as improved performance, reduced batch window allocation for QMF applications, and reduced storage requirements.
By using the new TABLE keyword on the RUN QUERY command, you can now save data, using the SAVE DATA command, without needing to return and complete a data object. The RUN QUERY command with the TABLE keyword operates completely within the database to both retrieve data and insert rows without returning a report to the user.
Usability of the TSO client is improved by the enhanced editor feature (see the QMF Enhanced Editor section for more detail).
Both the TSO and CICS clients now have the ability to organize queries, procedures, forms, and analytics into groups called folders, aiding in productivity and usability. QMF commands such as LIST, SAVE, ERASE, and RENAME have been updated to work with folders.
QMF TSO and CICS clients now have additional report preview options. After proper setting of the DSQDC_DISPLAY_RPT global variable, users will be able to enter a report mini-session, where queries can be run to view potential output without actually committing the results. The report mini-session can be useful for running and testing SELECT with change type queries. Upon exiting the report mini-session, the user will be prompted to COMMIT or ROLLBACK the query.
With Version 12, QMF's TSO and CICS clients deliver significant performance and storage improvements.
Using the new QMF program parameter option DSQSMTHD, users can make use of a second database thread. The second thread is used for RUN QUERY and DISPLAY TABLE command processing. Usage of a second database thread can assist with performance issues on SAVE operations with an incomplete report outstanding. Additionally, usage of the second thread can reduce storage requirements for SAVE DATA commands on large report objects, because rows do not need to reside in storage but can be retrieved from the database and inserted into the new table as needed.
Using the DSQEC_BUFFER_SIZE global variable, the QMF internal storage area used to fetch database row data can be increased. By changing the default from 4 kilobytes to a value up to 256 kilobytes, QMF can increase the amount of data fetched in a single call to the database. Fewer calls to the database reduce the amount of time it takes to complete the report, which can result in significant performance improvements.
QMF's TSO and CICS clients now integrate with QMF Data Service, enabling users of this interface to access a broader range of data sources. The support enables access to z/OS and non-z/OS data sources, including relational and nonrelational data sources (see the QMF Data Service section for a description of accessible data types). This capability is available only through QMF Enterprise Edition.
QMF Enhanced Editor (new) provides usability improvements to the TSO client by bringing customizable highlighting and formatting for SQL syntax, reserved words, functions, and data types, and parenthesis checking. The new query assist feature provides table name suggestions, column name and data type information, and suggested column value information, plus a preview pane.

QMF Analytics for TSO has been enhanced as follows:

Three new statistics models have been added: Wilcoxon Signed-Rank Test, Mann-Whitney U Test, and the F-Test model.
A user-defined mapping capability has been added. OpenGIS WKT map definitions are available in either DB2 tables or exported data sets, which can be read to format user-specific maps.
Maps for Africa, North America, South America, and Germany have been added to the existing library of predefined maps.
The ability to choose columns for use in analytical analyses has been improved with enhanced data type targeting and information.
Mouse (graphics cursor) support is added for quicker interaction with the QMF Analytics for TSO functionality.
Saving analytics has been updated to display a list of existing analytics objects.
QMF for Workstation and QMF for WebSphere add additional support for DB2 Analytics Accelerator and enable QMF objects to be used as virtual tables in QMF Data Service.

Administrators now have the ability to specify whether the DB2 Analytics Accelerator should be used by QMF users when available (by database and query) through new resource limit options on the data source or object.
QMF Workstation and QMF for WebSphere can now write data to the DB2 Analytics Accelerator. Data can be saved as Accelerator-only tables or Accelerator-shadow tables. Queries could then be created against this data, enabling them to take advantage of the DB2 Analytics Accelerator.
QMF will detect DB2 Analytics Accelerator appliances and display these appliances under the data source. Users can also see tables that exist on the DB2 Analytics Accelerator and even add additional tables to the DB2 Analytics Accelerator by dragging and dropping tables into the appliance folder.
QMF-prepped data will be made accessible as virtual tables or stored procedures to external applications through data service connectors such as:
Mainframe Data Service for Apache Spark on z/OS
Rocket DV
Rocket Mainframe Data Service on IBM Bluemix
IBM DB2 Analytics Accelerator Loader
QMF Data Service enables DB2 QMF to access numerous data sources and largely eliminates the need to move data in order to perform your analytics. It enables you to obtain real-time analytics insights using a high-performance in-memory mainframe solution.

The need for real-time information requires a high-performance data architecture that can handle the extreme volumes and unique requirements of mainframe data and that is transparent to the business user. DB2 QMF's new data service includes several query optimization features, such as parallel I/O and MapReduce. Multiple parallel threads handle input requests, continually streaming and buffering data to the client. The mainframe MapReduce technology greatly reduces the elapsed time of the query by accessing the database with multiple threads that read the file in parallel.

Data definitions and schema information are extracted from a variety of places to create virtual tables. All of the implementation details are hidden to the user, presented instead as a single logical data source. The logical data source is easily administered through the new Eclipse-based QMF Data Service Studio. With QMF Data Service Studio, DB2 QMF now supports a broader range of data sources, including:

Mainframe: Relational/nonrelational databases and file structures: ADABAS, DB2, VSAM, and Physical Sequential; CICS and IMS
Distributed: Databases running on Linux, UNIX, and Microsoft Windows platforms: DB2, Oracle, Informix, Derby, and SQL Server
Cloud and big data: Cloud-based relational and nonrelational data, and support for Hadoop
Data prepared in QMF will be made accessible as virtual tables to external applications through Data Service connectors such as:

Mainframe Data Service for Apache Spark on z/OS
Rocket DV
Rocket Mainframe Data Service on IBM Bluemix
DB2 Analytics Accelerator Loader
QMF Vision (new) is a web client visualization interface that enables you to create, modify, and drill down on data visualizations that are displayed on a dashboard. Users have the ability to drag and drop whatever dimensions or measures are needed, or add more variables for increased drill-down capability. Column, pie, treemap, geo map, line, scatter and many more chart objects are available. This gives a business user the ability to analyze data and provide insights that might not be readily apparent.

The most commonly requested guided analytics capabilities, such as outlier detection and cardinality, are now provided out of the box. These capabilities are integrated into the architecture for an intuitive analysis experience. For one-off decision making, you can quickly create simple reports using the tabular chart, which gives you a line-by-line view of summary data. Reports can be formatted to produce multilevel grouping, hierarchical structures, and dynamic cross tabulations, all for greater readability.

This enhancement simplifies the sharing of insights and collaboration with other users. Dashboards can be dropped into the chat window and other users can immediately start collaborating. They can discuss performance results, strategy, and opportunities and discover new insights about the data. Users can connect to new data sources as well as work with existing QMF queries and tables.

QMF Classic Edition supports users working entirely on traditional mainframe terminals and emulators, including IBM Host On Demand, to access DB2 databases. QMF Classic Edition consists of the following capabilities in V12:

QMF for TSO and CICS
QMF Enhanced Editor
QMF Analytics for TSO
QMF High Performance Option (HPO)

Get the most out of DB2 for z/OS with modern application development language support



More Information:

http://www.idug.org/p/bl/et/blogid=278&blogaid=593

https://www-01.ibm.com/support/docview.wss?uid=swg27047206

http://www.ibmsystemsmag.com/Blogs/DB2utor/October-2016/Thoughts-on-DB2-12/

http://www.idug.org/p/bl/et/blogid=477&blogaid=495

http://www.ibmbigdatahub.com/blog/new-ibm-db2-release-simplifies-deployment-and-key-management

https://developer.ibm.com/mainframe/2017/07/17/ibm-z-software-z14-announcement/

https://www-03.ibm.com/press/us/en/pressrelease/52805.wss

https://www.ibm.com/us-en/marketplace/z14

https://www-03.ibm.com/systems/z/solutions/enterprise-security.html

https://www.youtube.com/user/IBMDB2forzOS

http://www-01.ibm.com/common/ssi/ShowDoc.wss?docURL=/common/ssi/rep_sm/2/760/ENUS5650-DB2/index.html&request_locale=en


https://www.ibm.com/analytics/us/en/technology/db2/db2-12-for-zos.html

https://www.ibm.com/support/knowledgecenter/SSEPEK_12.0.0/java/src/tpc/imjcc_rjv00010.html

https://www.ibm.com/support/knowledgecenter/en/SSEPGG_9.7.0/com.ibm.db2.luw.qb.server.doc/doc/r0008865.html

https://www-01.ibm.com/common/ssi/cgi-bin/ssialias?subtype=ca&infotype=an&appname=iSource&supplier=897&letternum=ENUS216-378

http://ibmsystemsmag.com/Blogs/DB2utor/January-2017/DB2-12-REORG-Enhancements/

http://www.ibmsystemsmag.com/Blogs/DB2utor/November-2015/IBM-Launching-DB2-12-for-z-OS-ESP/

https://www.facebook.com/Db2community/

https://www.facebook.com/Db2community/videos/10154713389235872/

https://www.ibm.com/analytics/us/en/events/machine-learning/?cm_mmc=OSocial_Twitter-_-Analytics_Database+-+Data+Warehousing+-+Hadoop-_-WW_WW-_-Twitter+organic&cm_mmca1=000000TA&cm_mmca2=10000659&

https://www-03.ibm.com/services/learning/ites.wss/zz-en?pageType=course_description&cc=&courseCode=CL206G

Oracle Database 12c Release 2


Oracle Database 12c Release 2 (12.2) is now available everywhere

Ask Tom Answer Team (Connor McDonald and Chris Saxon) on Oracle Database 12c Release 2 New Features



Oracle Database 12c Release 2 (12.2) Architecture Diagram
http://www.oracle.com/webfolder/technetwork/tutorials/obe/db/12c/r1/poster/OUTPUT_poster/pdf/Database%20Architecture.pdf


The latest generation of the world's most popular database, Oracle Database 12c Release 2 (12.2), is now available everywhere - in the Cloud, with Oracle Cloud at Customer, and on-premises. This latest release provides organizations of all sizes with access to the world's fastest, most scalable and reliable database technology in a cost-effective, hybrid Cloud environment. Release 12.2 also includes a series of innovations that help customers easily transform to the Cloud while preserving their investments in Oracle Database technologies, skills and resources.

Oracle RAC 12c Release 2 New Features

Database Security - Comprehensive Defense in Depth

Partner Webcast – Oracle Identity Cloud Service: Introducing Secure, On-Demand Identity Management



Oracle Database 12c provides multi-layered security, including controls to evaluate risks, prevent unauthorized data disclosure, detect and report on database activities, and enforce data access controls in the database with data-driven security. Oracle Database 12c Release 2 (12.2), now available in the Cloud and on-premises, introduces new capabilities such as online and offline tablespace encryption and database privilege analysis. Combined with Oracle Key Vault and Oracle Audit Vault and Database Firewall, Oracle Database 12c provides unprecedented defense-in-depth capabilities to help organizations address existing and emerging security and compliance requirements.
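For instance, the 12.2 online tablespace encryption capability is exposed as a single DDL statement; the Python sketch below (tablespace, data file names and connection details are placeholders, and a TDE keystore is assumed to be configured already) encrypts an existing tablespace without taking it offline:

# Minimal sketch: encrypt an existing tablespace online in Oracle 12.2.
# Assumes a configured TDE keystore; names, paths and credentials are placeholders.
import cx_Oracle

conn = cx_Oracle.connect("sys", "secret", "dbhost.example.com/orclpdb1",
                         mode=cx_Oracle.SYSDBA)
cur = conn.cursor()
cur.execute("""
    ALTER TABLESPACE users ENCRYPTION ONLINE USING 'AES256' ENCRYPT
      FILE_NAME_CONVERT = ('users01.dbf', 'users01_enc.dbf')""")
conn.close()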

Partner Webcast – Enabling Oracle Database High Availability and Disaster Recovery with Oracle Cloud


Database Cloud Services

Oracle Cloud provides several Oracle Cloud Service deployment choices. These choices allow you to start at the cost and capability level suitable to your use case and then give you the flexibility to adapt as your requirements change over time. Choices include: single schemas, dedicated pluggable databases, virtualized databases, bare metal databases and databases running on world-class engineered infrastructure.

The Oracle Exadata Cloud Service offers the largest, most business-critical database workloads a place to run in Oracle Cloud. With all the infrastructure components in place, including hardware, networking, storage, database and virtualization, secure, highly available, high-performance capacity is easily provisioned in a few clicks. Exadata Cloud Service is engineered to support OLTP, data warehouse / real-time analytics and mixed database workloads at any scale. With this service, you maintain control of your database while Oracle manages the hardware, storage and networking infrastructure, letting you focus on growing your business.



https://cloud.oracle.com/database

Oracle Database Exadata Cloud Machine delivers the world’s most advanced database cloud to customers who require their databases to be located on-premises. Exadata Cloud Machine uniquely combines the world’s #1 database technology and Exadata, the most powerful database platform, with the simplicity, agility and elasticity of a cloud-based deployment. It is identical to Oracle’s Exadata public cloud service, but located in customers’ own data centers and managed by Oracle Cloud Experts. Every Oracle Database and Exadata feature and option is included with the Exadata Cloud Machine subscription, ensuring highest performance, best availability, most effective security and simplest management. Databases deployed on Exadata Cloud Machine are 100% compatible with existing on-premises databases, or databases that are deployed in Oracle’s public cloud. Exadata Cloud Machine is ideal for customers who desire cloud benefits but cannot move their databases to the public cloud due to sovereignty laws, industry regulations, or corporate policies, as well as for organizations that find it impractical to move databases away from other tightly coupled on-premises IT infrastructure.

Oracle Database 12c Release 2 Sharded Database Overview and Install (Part 1)


Oracle Sharding Part 2


Oracle Sharding Part 3


Oracle Sharding with Suresh Gandhi

Overview of Oracle’s Big Data Management System

As today's enterprises embrace big data, their information architectures are evolving. The new information architecture in the big data era embraces emerging technologies such as Hadoop, but at the same time leverages the core strengths of previous data warehouse architectures.

Partner Webcast – Oracle Ravello Cloud Service: Easy Deploying of Big Data VM on Cloud



The data warehouse, built upon Oracle Database 12c Release 2 and Exadata, will continue to be the primary analytic database for storing core transactional data: financial records, customer data, point-of-sale data and so forth (see Key Data Warehousing and Big Data Capabilities for more information).

However, the data warehouse will be augmented by a big-data system (built upon Oracle Big Data Appliance), which functions as a ‘data reservoir’. This will be the repository for the new sources of large volumes of data: machine-generated log files, social-media data, and videos and images -- as well as a repository for more granular transactional data or older transactional data which is not stored in the data warehouse.

Data flows between the big data system and the data warehouse to create a unified foundation: the Oracle Big Data Management System.

The transition from the Enterprise Data Warehouse-centric architecture to the Big Data Management System - whether on-premises, in the Cloud, or in hybrid Cloud systems - is going to revolutionize any company's information management architecture. Oracle's Statement of Direction outlines Oracle's vision for delivering innovative new technologies for building the information architecture of tomorrow.

Partner Webcast – Docker Agility in Cloud: Introducing Oracle Container Cloud Service





Big data is in many ways an evolution of data warehousing. To be sure, there are new technologies used for big data, such as Hadoop and NoSQL databases. And the business benefits of big data are potentially revolutionary. However, at its essence, big data requires an architecture that acquires data from multiple data sources, organizes and stores that data in a suitable format for analysis, enables users to efficiently analyze the data and ultimately helps to drive business decisions. These are the exact same principles that IT organizations have been following for data warehouses for years.




The new information architecture that enterprises will pursue in the big data era is an extension of their previous data warehouse architectures. The data warehouse, built upon a relational database, will continue to be the primary analytic database for storing much of a company’s core transactional data, such as financial records, customer data, and sales transactions. The data warehouse will be augmented by a big-data system, which functions as a ‘data lake’. This will be the repository for the new sources of large volumes of data: machine-generated log files, social-media data, and videos and images -- as well as the repository for more granular transactional data or older transactional data which is not stored in the data warehouse. Even though the new information architecture consists of multiple physical data stores (relational, Hadoop, and NoSQL), the logical architecture is a single integrated data platform, spanning the relational data warehouse and the Hadoop-based data lake.

Technologies such as Oracle Big Data SQL make this distributed architecture a reality; Big Data SQL provides data virtualization capabilities, so that SQL can be used to access any data, whether in relational databases or Hadoop or NoSQL. This virtualized SQL layer also enables many other languages and environments, built on top of SQL, to seamlessly access data across the entire big data platform.
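A hedged sketch of what this looks like in practice: with the Big Data SQL access drivers installed, a Hive table can be exposed to Oracle SQL as an external table and queried with ordinary SQL alongside relational data (all object names below are illustrative, and by default the ORACLE_HIVE driver maps to a Hive table with the same name):

# Minimal sketch: expose a Hive table through a Big Data SQL external table and
# query it with ordinary SQL. Object names and connection details are illustrative.
import cx_Oracle

conn = cx_Oracle.connect("dw_user", "secret", "dbhost.example.com/orclpdb1")
cur = conn.cursor()

cur.execute("""
    CREATE TABLE web_logs (
        session_id  NUMBER,
        url         VARCHAR2(4000),
        log_ts      TIMESTAMP
    )
    ORGANIZATION EXTERNAL (TYPE ORACLE_HIVE DEFAULT DIRECTORY default_dir)
    REJECT LIMIT UNLIMITED""")

cur.execute("SELECT COUNT(*) FROM web_logs")   # runs against data stored in Hadoop
print(cur.fetchone())
conn.close()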

Oracle Database 12c Release 2 and Oracle Exadata: A Data Warehouse as a Foundation for Big Data

Even as new big data architectures emerge and mature, business users will continue to analyze data by directly leveraging and accessing data warehouses. The rest of this paper describes how Oracle Database 12c Release 2 provides a comprehensive platform for data warehousing that combines industry-leading scalability and performance, deeply-integrated analytics, and advanced workload management – all in a single platform running on an optimized hardware configuration.


Hot cloning and refreshing PDBs in Oracle 12cR2

Exadata

The bedrock of a solid data warehouse solution is a scalable, high-performance hardware infrastructure. One of the long-standing challenges for data warehouses has been to deliver the IO bandwidth necessary for large-scale queries, especially as data volumes and user workloads have continued to increase. While the Oracle Exadata Database Machine is designed to provide the optimal database environment for every enterprise database, the Exadata architecture also provides a uniquely optimized storage solution for data warehousing that delivers order-of-magnitude performance gains for large-scale data warehouse queries and very efficient data storage via compression for large data volumes. A few of the key features of Exadata that are particularly valuable to data warehousing are:

  • » Exadata Smart Scans. With traditional storage, all database intelligence resides on the database servers. However, Exadata has database intelligence built into the storage servers. This allows database operations, and specifically SQL processing, to leverage the CPUs in both the storage servers and database servers to vastly improve performance. The key feature is “Smart Scans”, the technology of offloading some of the data-intensive SQL processing into the Exadata Storage Server: specifically, row-filtering (the evaluation of where-clause predicates) and column-filtering (the evaluation of the select-list) are executed on the Exadata Storage Server, and a much smaller set of filtered data is returned to the database servers. “Smart scans” can improve the performance of large queries by an order of magnitude, and, in conjunction with the vastly superior IO bandwidth of Exadata’s architecture, deliver industry-leading performance for large-scale queries.
  • » Exadata Storage Indexes. Completely automatic and transparent, Exadata Storage Indexes maintain each column’s minimum and maximum values of tables residing in the storage server. With this information, Exadata can easily filter out unnecessary data to accelerate query performance.
  • » Hybrid Columnar Compression. Data can be compressed within the Exadata Storage Server into a highly efficient columnar format that provides up to a 10 times compression ratio, without any loss of query performance. And, for pure historical data, a new archival level of hybrid columnar compression can be used that provides up to 40 times compression ratios.

Partner Webcast - Oracle Cloud Machine Technical Overview (Part1)



Partner Webcast - Oracle Cloud Machine Technical Overview (Part 2)


Oracle Database In-Memory

While Exadata tackles one major requirement for high-performance data warehousing (high-bandwidth IO), Oracle Database In-Memory tackles another requirement: interactive, real-time queries. Reading data from memory can be orders of magnitude faster than reading from disk, but that is only part of the performance benefits of In-Memory: Oracle additionally increases in-memory query performance through innovative memory-optimized performance techniques such as vector processing and an optimized in-memory aggregation algorithm. Key features include:

  • » In-memory (IM) Column Store. Data is stored in a compressed columnar format when using Oracle Database In-Memory. A columnar format is ideal for analytics, as it allows for faster data retrieval when only a few columns are selected from a table(s). Columnar data is very amenable to efficient compression; in-memory data is typically compressed 2-20x, which enables larger volumes of raw data to be stored in the in-memory column store.
  • » SIMD Vector Processing. When scanning data stored in the IM column store, Database In-Memory uses SIMD vector processing (Single Instruction processing Multiple Data values). Instead of evaluating each entry in the column one at a time, SIMD vector processing allows a set of column values to be evaluated together in a single CPU instruction. In this way, SIMD vector processing enables Oracle Database In-Memory to scan and filter billions of rows per second.
  • » In-Memory Aggregation. Analytic queries require more than just simple filters and joins. They require complex aggregations and summaries. Oracle Database In-Memory provides an aggregation algorithm specifically optimized for the join-and-aggregate operations found in typical star queries. This algorithm allows dimension tables to be joined to the fact table, and the resulting data set aggregated, all in a single in-memory pass of the fact table.

Oracle Database In-Memory is useful for every data-warehousing environment. Oracle Database In-Memory is entirely transparent to applications and tools, so that it is simple to implement. Unlike a pure in-memory database, not all of the objects in an Oracle database need to be populated in the IM column store. The IM column store should be populated with the most performance-critical data, while less performance-critical data can reside on lower cost flash or disk. Thus, even the largest data warehouse can see considerable performance benefits from In-Memory.
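Populating a performance-critical table into the IM column store is a one-statement change; the hedged sketch below (table names and connection details are placeholders) marks a fact table INMEMORY and runs an analytic query that can then be served from the column store:

# Minimal sketch: place a hot table in the In-Memory column store and run an
# analytic query against it. Names and connection details are placeholders.
import cx_Oracle

conn = cx_Oracle.connect("dw_user", "secret", "dbhost.example.com/orclpdb1")
cur = conn.cursor()

# Populate SALES into the IM column store; less critical tables can stay on disk or flash.
cur.execute("ALTER TABLE sales INMEMORY PRIORITY HIGH")

cur.execute("""
    SELECT prod_id, SUM(amount_sold)
    FROM   sales
    WHERE  time_id >= DATE '2017-01-01'
    GROUP  BY prod_id""")
print(cur.fetchmany(10))
conn.close()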

Query Performance

Oracle provides performance optimizations for every type of data warehouse environment. Data warehouse workloads are often complex, with different users running vastly different operations, with similarly different expectations and requirements for query performance. Exadata and In-Memory address many performance challenges, but many other fundamental performance capabilities are necessary for enterprise-wide data warehouse performance.
Oracle meets the demands of data warehouse performance by providing a broad set of optimization techniques for every type of query and workload:

  • » Advanced indexing and aggregation techniques for sub-second response times for reporting and dashboard queries. Oracle’s bitmap and b-tree indexes and materialized views provide developers and DBAs with tools to make pre-defined reports and dashboards execute with fast performance and minimal resource requirements.
  • » Star query optimizations for dimensional queries. Most business intelligence tools have been optimized for star-schema data models. The Oracle Database is highly optimized for these environments; Oracle Database In-Memory provides fast star-query performance by leveraging its in-memory aggregation capabilities. For other database environments, Oracle’s “star transformation” leverages bitmap indexes on the fact table to efficiently join multiple dimension tables in a single processing step. Meanwhile, Oracle OLAP is a complete multidimensional analytic engine embedded in the Oracle Database, storing data within multidimensional cubes inside the database accessible via SQL. The OLAP environment provides very fast access to aggregate data in a dimensional environment, in addition to sophisticated calculation capabilities (the latter is discussed in a subsequent section of this paper).
  • » Scalable parallelized processing. Parallel execution is one of the fundamental database technologies that enable users to query data volumes of any size. It is the ability to apply multiple CPU and IO resources to the execution of a single database operation. Oracle’s parallel architecture allows any query to be parallelized, and Oracle dynamically chooses the optimal degree of parallelism for every query based on the characteristics of the query, the current workload on the system and the priority of the requesting user.
  • » Partition pruning and partition-wise joins. Partition pruning is perhaps one of the simplest query-optimization techniques, but also one of the most beneficial. Partition pruning enables a query to access only the necessary partitions, rather than accessing an entire table – frequently, partition pruning alone can speed up a query by two orders of magnitude. Partition-wise joins provide similar performance benefits when joining tables that are partitioned by the same key. Together these partitioning optimizations are fundamental for accelerating performance for queries on very large database objects (a minimal sketch of partition pruning follows this list).
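A minimal sketch of partition pruning, with illustrative object names and connection details: the fact table is range-partitioned by date, so a query that filters on the partition key touches only the partitions that can contain matching rows.

# Minimal sketch: range (interval) partitioning lets the optimizer prune to the
# partitions covered by the date filter. Names and connection details are placeholders.
import cx_Oracle

conn = cx_Oracle.connect("dw_user", "secret", "dbhost.example.com/orclpdb1")
cur = conn.cursor()

cur.execute("""
    CREATE TABLE sales_fact (
        sale_date  DATE,
        prod_id    NUMBER,
        amount     NUMBER
    )
    PARTITION BY RANGE (sale_date)
    INTERVAL (NUMTOYMINTERVAL(1, 'MONTH'))
    (PARTITION p0 VALUES LESS THAN (DATE '2017-01-01'))""")

# Only the partition(s) covering March 2017 are scanned; the rest are pruned.
cur.execute("""
    SELECT prod_id, SUM(amount)
    FROM   sales_fact
    WHERE  sale_date BETWEEN DATE '2017-03-01' AND DATE '2017-03-31'
    GROUP  BY prod_id""")
conn.close()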

Oracle Database 12c Release 2 Rapid Home Provisioning and Maintenance


The query performance techniques described here operate in a concerted fashion and provide multiplicative performance gains. For example, a single query may be improved by 10x via partition pruning, by 5x via parallelism, by 20x via star query optimization, and by 10x via Exadata smart scans – a net improvement of 10,000x compared to a naïve SQL engine.
Orchestrating the query capabilities of the Oracle database are several foundational technologies. Every query running in a data warehouse benefits from:

  • » A query optimizer that determines the best strategy for executing each query, from among all of the available execution techniques available to Oracle. Oracle’s query optimizer provides advanced query-transformation capabilities, and, in Oracle Database 12c, the query optimizer adds Adaptive Query Optimization, which enables the optimizer to make run-time adjustments to execution plans.
  • » A sophisticated resource manager for ensuring performance even in databases with complex, heterogeneous workloads. The Database Resource Manager allows end-users to be grouped into ‘resource consumer groups’, and for each group, the database administrator can set policies to govern the amount of CPU and IO resources that can be utilized, as well as specify policies for proactive query governing, and for query queuing. With the Database Resource Manager, Oracle provides the capabilities to ensure that a data warehouse can address the requirements of multiple concurrent workloads, so that a single data warehouse platform can, for example, simultaneously service hundreds of online business analysts doing ad hoc analysis in a business intelligence tool, thousands of business users viewing dashboards, and dozens of data scientists doing deep data exploration.
  • » Management Packs to automate the ongoing performance tuning of a data warehouse. Based upon the ongoing performance and query workload, management packs provide recommendations for all aspects of performance, including indexes and partitioning.

More Information:

http://www.oracle.com/technetwork/database/enterprise-edition/downloads/index.html

http://www.oracle.com/webfolder/technetwork/tutorials/obe/db/12c/r1/poster/OUTPUT_poster/poster.html

https://docs.oracle.com/en/database/

https://docs.oracle.com/database/122/ADMIN/title.htm

http://docs.oracle.com/database/121/CNCPT/cdbovrvw.htm#CNCPT89234

https://docs.oracle.com/database/122/whatsnew.htm

https://docs.oracle.com/database/122/NEWFT/title.htm

https://docs.oracle.com/database/122/NEWFT/toc.htm

https://docs.oracle.com/database/122/INMEM/title.htm

http://www.oracle.com/technetwork/database/enterprise-edition/downloads/oracle12c-windows-3633015.html

https://docs.oracle.com/database/122/LADBN/toc.htm#LADBN-GUID-2404CE5F-6894-4B26-9213-8A47DC262109

http://www.oracle.com/us/corporate/analystreports/ovum-cloud-first-strategy-oracle-db-3520721.pdf

The NEW Oracle Database Appliance Portfolio   https://go.oracle.com/LP=55375?elqCampaignId=52477&src1=ad:pas:go:dg:oda&src2=wwmk160603p00096c0015&SC=sckw=WWMK160603P00096C0015&mkwid=sFw6OzrF5%7Cpcrid%7C215765003921%7Cpkw%7Coracle%20database%7Cpmt%7Cp%7Cpdv%7Cc%7Csckw=srch:oracle%20database

https://broadcast.oracle.com/odatouchcastEN

Oracle Database 12c Release 2 - Get Started with Oracle Database   https://docs.oracle.com/database/122/index.htm

http://www.oracle.com/technetwork/database/security/overview/index.html

http://www.oracle.com/technetwork/database/bi-datawarehousing/data-warehousing-wp-12c-1896097.pdf

http://www.oracle.com/technetwork/database/upgrade/overview/upgrading-oracle-database-wp-122-3403093.pdf

http://www.oracle.com/technetwork/database/upgrade/overview/index.html

Oracle SPARC M8 and Oracle Advanced Analytics



Oracle SPARC M8 released with 32 cores 256 threads 5.0GHz

Oracle announced its eighth generation SPARC platform, delivering new levels of security capabilities, performance, and availability for critical customer workloads. Powered by the new SPARC M8 microprocessor, new Oracle systems and IaaS deliver a modern enterprise platform, including proven Software in Silicon with new v2 advancements, enabling customers to cost-effectively deploy their most critical business applications and scale-out application environments with extreme performance both on-premises and in Oracle Cloud.

Oracle’s Advanced Analytics & Machine Learning 12.2c New Features & Road Map; Bigger, Better, Faster, More!



SPARC M8 processor-based systems, including the Oracle SuperCluster M8 engineered systems and SPARC T8 and M8 servers, are designed to seamlessly integrate with existing infrastructures and include fully integrated virtualization and management for private cloud. All existing commercial and custom applications will run on SPARC M8 systems unchanged with new levels of performance, security capabilities, and availability. The SPARC M8 processor with Software in Silicon v2 extends the industry's first Silicon Secured Memory, which provides always-on hardware-based memory protection for advanced intrusion protection, as well as end-to-end encryption and Data Analytics Accelerators (DAX) with open APIs for breakthrough performance and efficiency when running database analytics and Java streams processing. The Oracle Cloud SPARC Dedicated Compute service will also be updated with the SPARC M8 processor.

Spark SQL: Another 16x Faster After Tungsten: Spark Summit East talk by Brad Carlile



"Oracle has long been a pioneer in engineering software and hardware together to secure high-performance infrastructure for any workload of any size," said Edward Screven, chief corporate architect, Oracle. "SPARC was already the fastest, most secure processor in the world for running Oracle Database and Java. SPARC M8 extends that lead even further."



The SPARC M8 processor offers security enhancements delivering 2x faster encryption and 2x faster hashing than x86 and 2x faster than SPARC M7 microprocessors. The SPARC M8 processor's unique design also provides always-on security by default and built-in protection of in-memory data structures from hacks and programming errors.



SPARC M8's silicon innovation provides new levels of performance and efficiency across all workloads, including: 
  • Database: Engineered to run Oracle Database faster than any other microprocessor, SPARC M8 delivers 2x faster OLTP performance per core than x86 and 1.4x faster than M7 microprocessors, as well as up to 7x faster database analytics than x86.
  • Java: SPARC M8 delivers 2x better Java performance than x86 and 1.3x better than M7 microprocessors.  DAX v2 produces 8x more efficient Java streams processing, improving overall application performance.
  • In-Memory Analytics: The innovative new processor delivers 7x more Queries per Minute (QPM) per core than x86 for database analytics.
Oracle is committed to delivering the latest in SPARC and Solaris technologies and servers to its global customers. Oracle's long history of binary compatibility across processor generations continues with M8, providing an upgrade path for customers when they are ready. Oracle has also publicly committed to supporting Solaris until at least 2034.

Oracle SPARC M8 is available in:

  • Oracle SPARC M8
  • Oracle SPARC T8-1 server
  • Oracle SPARC T8-2 server
  • Oracle SPARC T8-4 server
  • Oracle SPARC M8-8 server
  • Oracle SuperCluster M8 engineered system

More information in: Oracle SPARC M8 Launch Webcast:  http://www.oracle.com/us/corporate/events/next-gen-secure-infrastructure-platform/index.html

About Oracle 

The Oracle Cloud offers complete SaaS application suites for ERP, HCM and CX, plus best-in-class database Platform as a Service (PaaS) and Infrastructure as a Service (IaaS) from data centers throughout the Americas, Europe and Asia. For more information about Oracle (NYSE: ORCL), please visit us at oracle.com.

Big data analytics using oracle advanced analytics and big data sql



The Oracle SPARC M8 is now out and is a monster of a chip. Each SPARC M8 processor supports up to 32 cores and 64MB of L3 cache. Each core can handle 8 threads, for up to 256 threads. Compare this to the AMD EPYC 7601, the world’s only 32-core x86 processor as of this writing, which handles 64 threads and also has 64MB of L3 cache. The cores can also clock up to 5.0GHz, faster than current x86 high-core-count server chip designs from Intel and AMD. That is quite astounding given that the SPARC M8 still uses 20nm process technology.

Beyond the simple core specs, there is much more going on. Oracle has specific accelerators for cryptography, Java performance, database performance, and more. For example, there are 32 on-chip Data Analytics Accelerator (DAX) engines. DAX engines offload query processing and perform real-time data decompression. Oracle’s software business for the Oracle Database line is still strong, and these capabilities are what is often referred to as “SQL in Silicon.” Oracle claims that Oracle Database 12c is up to 7 times faster using M8 with DAX than with competing CPUs. That is a big deal for software licensing costs. Another interesting capability is inline decompression, which allows data stored compressed in memory to be decompressed with no claimed performance penalty.

Oracle SPARC M8 Processor Key Specifications

Here are the key specs for the new Oracle SPARC CPUs:


  • 32 SPARC V9 cores, maximum frequency: 5.0 GHz
  • Up to 256 hardware threads per processor; each core supports up to 8 threads
  • Total of 64 MB L3 cache per processor, 16-way set-associative and inclusive of all inner caches
  • 128 KB L2 data cache per core; 256 KB L2 instruction cache shared among four cores
  • 32 KB L1 instruction cache and 16 KB L1 data cache per core
  • Quad-issue, out-of-order integer execution pipelines, one floating-point unit, and integrated cryptographic stream processing per core
  • Sophisticated branch predictor and hardware data prefetcher
  • 32 second-generation DAX engines; 8 DAX units per processor with four pipelines per DAX unit
  • Encryption instruction accelerators in each core with direct support for 16 industry-standard cryptographic algorithms plus random-number generation: AES, Camellia, CRC32c, DES, 3DES, DH, DSA, ECC, MD5, RSA, SHA-1, SHA-3, SHA-224, SHA-256, SHA-384, and SHA-512
  • 20 nm process technology
  • Open Oracle Solaris APIs available for software developers to leverage the Silicon Secured Memory and DAX technologies in the SPARC M8 processor
  • Oracle Solaris support committed until at least 2034


In the official Oracle SPARC M8 release, Oracle includes a note that is a clear nod to its organizational changes (which we mentioned in a recent Oracle server release):

Oracle is committed to delivering the latest in SPARC and Solaris technologies and servers to its global customers. Oracle’s long history of binary compatibility across processor generations continues with M8, providing an upgrade path for customers when they are ready. Oracle has also publicly committed to supporting Solaris until at least 2034.

Oracle is clearly hearing from its customers about the mass layoffs of Solaris engineering teams.

New Oracle SPARC M8 Systems

Five new SPARC V9 systems are available from Oracle today:

  • Oracle SPARC T8-1 server
  • Oracle SPARC T8-2 server
  • Oracle SPARC T8-4 server
  • Oracle SPARC M8-8 server
  • Oracle SuperCluster M8 engineered system

The Evolution and Future of Analytics

We live in a world where things around us are ever changing.



Measurement metrics are just-in-time and predictive and need a lot of augmented intelligence; meanwhile, we are developing more complex "mind analytics" when it comes to buying patterns.

This new type of analytics can give us insight into how the customer feels and what he or she experiences.

Oracle's Machine Learning & Advanced Analytics 12.2 & Oracle Data Miner 4.2 New Features


Thus, smart information will become increasingly available.

In the future, you may walk into a store and find one or all of the below, which can be built as solutions:

a) A robot welcoming you and taking over to interact with you, using a connected back end and analytics.

b) Natural language or human analytics that can automatically read your mood to ultimately improve customer satisfaction.

c) Historical data about you as a customer to help upsell or cross-sell products based on your interests.

d) Automatic analysis of what you're doing to bring near real-time context to data; this will enable the retailer to build a mobile-based intuitive presence or a no-billing architecture.

e) A personal assistant model to better serve you as a customer, empowering retailers to provide solutions to unsure customers.

f) In-product or "things" analytics to provide information about the product, making things intelligent through RFID, intelligent tagging, sensors, etc.

g) Discounts/coupons based on mixing historical buying patterns; post purchase analytics.

h) Interactive dashboards that make augmented decisions about a few areas based on reviews; this would take expert reviews, phone calls, product management and more into account.

i) A store platform of grammar, syntax, semantics, and data science to capture recurring patterns and challenges and to build new solutions that are continuous in nature.

Based on the above, let's dive into different types of analytics available on the market. We'll look at how they will blend and intersect to develop more augmented applications for the future.

Insights into Real-world Data Management Challenges





1) Historical Analytics

This is the traditional analytics of business intelligence focused on analyzing stored data and reporting. We would build repositories and create analyses and dashboards for historical data. Solutions would include Oracle Business Intelligence.

2) Current Analytics 

Here the analytics is measurement over current process. For example, we would measure the effectiveness of a process as it happens (business activity monitoring) using a stream that processes arriving data and analyzes it in real-time.

3) Enterprise Performance Management

Here the objective is to focus on projections/what-if analysis with the current data and make projections for the future. An example would be a Hyperion or an EPM based solution which could help derive and plan reporting as projections. EPM today is also available as a cloud service.

4) Predictive Analytics

With the Big Data market growing, and with unstructured data adding the dimensions of velocity, variety, and volume, the data world is moving on to more predictive analytics with a blended mix of data. There is one world of data in Hadoop and another in the classical data warehouse; we can mix and match the two and do Big Data analytics.

Predictive analytics works more like a compass for decision making, based on patterns found in data analysis. Oracle has an end-to-end Big Data solution, spanning the data warehouse, Hadoop, and analytics, that can help develop predictive solutions.

MSE 2017 - Customer Voted Session: Rocketing Your Knowledge Management Success with Analytics


5) Prescriptive Analytics

To extend predictive analytics, we also develop systems that act on a prediction once we have it, i.e. sending emails and connecting systems as patterns are detected. These are the basics of building more heuristic systems that make decisions about detected patterns.

6) Machine Analytics 

Every device and machine is going to generate data. Machine analytics is a blended form of data analysis in which machine data is embedded into the standard sources to enhance and improve the overall data pattern. Oracle provides IoT Cloud Service (IoT CS) as a solution to connect, analyze, and integrate data from various machines and enrich applications like ERP, CRM, and more.

Oracle Analytics and Big Data: Unleash the Value



7) AI Based Analytics

AI or deep learning is the next gen analytics pattern where we can train the systems or any entity to think and then embed the analytics pattern in the solution.

8) IoRT / Robotics Analytics

With robots, bots, and personal assistants complementing solutions, a lot of thinking and execution patterns are distributed across multiple systems. IoRT (Internet of Robotic Things) or robotics analytics is a new branch that will focus on how we can analyze patterns from these semi-thinking devices.

9) Data Science as Service 

A new branch where the analysis goes deeper in terms of algorithms and storage and is also more domain-driven. Even though data science is often treated as just one branch of analytics, you will see a lot of analytics development here; data scientists who specialize in identifying patterns will go a long way toward building patterns that are more replicable.

10) Integrated Analytics

In the future, we can form an integrated view of all of the above. This could be one IDE from which you derive patterns based on business need and use case. Today we have a fragmented set of tools to manage analytics, and it will slowly get integrated into one view.

Oracle has solutions at different levels; most of them are also available as a cloud service (Software as a Service, Platform as a Service).

MSE 2017 - Advanced Analytics for Developers


It's imperative to build the right mix of solutions for the right problem and integrate these solutions.

  • Historical perspective --> Business Intelligence 
  • Current processing  -->  Streaming (event processing) and Business Activity Monitoring
  • Enterprise performance management  --> Hyperion
  • Heterogeneous source of data and also large analysis of data --> Big Data Solution
  • Predictive and Prescriptive analytics --> R language and Advanced Analytics
  • Machine related --> IOT Solutions and Cloud Service

Oracle Architectural Elements Enabling R for Big Data


Oracle University provides competency solutions for all the above and empowers you with skill development and well-respected certifications that validate your expertise:


  • Big Data Analytics training
  • BI Data Analytics training
  • Hyperion training
  • Cloud PaaS Platform for Analytics and BI training



More Information:

http://www.oracle.com/technetwork/database/options/advanced-analytics/overview/index.html

http://www.oracle.com/technetwork/database/database-technologies/bdc/r-advanalytics-for-hadoop/overview/index.html

http://www.oracle.com/technetwork/database/options/advanced-analytics/r-enterprise/learnmore/user2017-archelements-hornick-3850449.pdf

https://blogs.oracle.com/bigdata/oracles-machine-learning-and-advanced-analytics-122-and-oracle-data-miner-42-new-features

https://blogs.oracle.com/bigdata/announcing-oracle-data-integrator-for-big-data

http://www.oracle.com/us/corporate/events/next-gen-secure-infrastructure-platform/index.html

https://www.nextplatform.com/2017/09/18/m8-last-hurrah-oracle-sparc/

https://blogs.oracle.com/datamining/evaluating-oracle-data-mining-has-never-been-easier-evaluation-kit-available-•-updated-for-oracle-database-122c-sqldev-42

https://blogs.oracle.com/datamining/

https://www.oracle.com/corporate/pressrelease/oracle-forrester-analytics-cloud-092117.html

https://blogs.oracle.com/datawarehousing/compendium/page/2

http://www.odtug.com/p/bl/et/blogaid=713

Oracle Visual Analytics

http://www.prnewswire.com/news-releases/english-releases/oracles-new-sparc-systems-deliver-2-7x-better-performance-security-capabilities-and-efficiency-than-intel-based-systems-300521018.html

https://blogs.oracle.com/datamining/oracle-biwa17-the-big-data-analytics-spatial-cloud-iot-everything-cool-oracle-user-conference-2017

http://www.vlamis.com/papers2017/

https://blogs.oracle.com/datawarehousing/announcing-oracle-advanced-analytics



SQL Server 2017 on Linux


Microsoft has heard from you, our customers: your data estate gets bigger, more complicated, and more diverse every year. You need solutions that work across platforms, whether on-premises or in the cloud, and that meet your data workloads where they are. Embracing this choice, Microsoft announced the general availability of SQL Server 2017 on Linux, Windows, and Docker on October 2, 2017.



Today, Microsoft and Red Hat are delivering on choice by announcing the availability of Microsoft SQL Server 2017 on Red Hat Enterprise Linux, the world’s leading enterprise Linux platform. As Microsoft’s reference Linux platform for SQL Server, Red Hat Enterprise Linux extends the enterprise database and analytics capabilities of SQL Server by delivering it on the industry-leading platform for performance, security features, stability, reliability, and manageability.

Customers will be able to bring the performance and security features of SQL Server to Linux workloads. SQL Server 2017 on Red Hat Enterprise Linux delivers mission-critical OLTP database capabilities and enterprise data warehousing with in-memory technology across workloads. SQL Server 2017 embraces developers by delivering choice in language and platform, with container support that seamlessly facilitates DevOps scenarios. The new release of SQL Server delivers all of this, built-in. And, it runs wherever you want, whether in your datacenter, in Azure virtual machines, or in containers running on Red Hat OpenShift Container Platform!



Also, starting October 2nd and running until June 30th, 2018, we are launching a SQL Server on Red Hat Enterprise Linux offer to help with upgrades and migrations. This offer provides up to 30% off SQL Server 2017 through an annual subscription. When customers purchase a new Red Hat Enterprise Linux subscription to support their SQL Server, they will be eligible for another 30% off their Red Hat Enterprise Linux subscription price.

In addition to discounts on SQL Server and Red Hat Enterprise Linux, all of this is backed by integrated support from Microsoft and Red Hat.

Bootcamp 2017 - SQL Server on Linux


 SQL Server 2017 is generally available for purchase and download! The new release is available right now for evaluation or purchase through the Microsoft Store, and will be available to Volume Licensing customers later today. Customers now have the flexibility for the first time ever to run industry-leading SQL Server on their choice of Linux, Docker Enterprise Edition-certified containers and, of course, Windows Server. It’s a stride forward for our modern and hybrid data platform across on-premises and cloud.

Everything you need to know about SQL Server 2017


In the 18 months since announcing our intent to bring SQL Server to Linux, we’ve been focused on making SQL Server perform and scale to the industry-leading levels customers expect from SQL Server, making SQL Server feel familiar yet native to Linux, and ensuring compatibility between SQL Server on Windows and Linux. With all the enterprise database features you rely on, from Active Directory authentication, to encryption, to Always On availability groups, to record-breaking performance, SQL Server is at parity on Windows and Linux. We have also brought SQL Server Integration Services to Linux so that you can perform data integration just like on Windows. SQL Server 2017 supports Red Hat Enterprise Linux, SUSE Linux Enterprise Server, and Ubuntu.

There are a number of new features for SQL Server that we think make this the best release ever. Here are just a few:

  • Container support seamlessly facilitates your development and DevOps scenarios by enabling you to quickly spin up SQL Server containers and get rid of them when you are finished. SQL Server supports Docker Enterprise Edition, Kubernetes and OpenShift container platforms.
  • AI with R and Python analytics enables you to build intelligent apps using scalable, GPU-accelerated, parallelized R and now Python analytics running in the database.
  • Graph data analysis enables customers to use graph data storage and query language extensions with graph-native query syntax to discover new kinds of relationships in highly interconnected data (a minimal T-SQL sketch follows this list).
  • Adaptive Query Processing is a new family of features in SQL Server that bring intelligence to database performance. For example, Adaptive Memory Grants in SQL Server track and learn from how much memory is used by a given query to right-size memory grants.
  • Automatic Plan Correction ensures continuous performance by finding and fixing performance regressions.
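
To illustrate the graph support called out above, here is a minimal T-SQL sketch. The table and column names (Person, FriendOf, PersonName) are hypothetical and only meant to show the AS NODE / AS EDGE syntax and a MATCH query; they are not taken from the announcement itself.

  -- Node and edge tables (hypothetical schema)
  CREATE TABLE dbo.Person (
      PersonId   INT PRIMARY KEY,
      PersonName NVARCHAR(100) NOT NULL
  ) AS NODE;

  CREATE TABLE dbo.FriendOf AS EDGE;   -- an edge table may be empty or carry its own attributes

  INSERT INTO dbo.Person (PersonId, PersonName) VALUES (1, N'Alice'), (2, N'Bob');

  -- Connect Alice to Bob through the FriendOf edge using the built-in pseudo-columns
  INSERT INTO dbo.FriendOf ($from_id, $to_id)
  SELECT p1.$node_id, p2.$node_id
  FROM dbo.Person AS p1, dbo.Person AS p2
  WHERE p1.PersonId = 1 AND p2.PersonId = 2;

  -- Graph-native query syntax: who are Alice's friends?
  SELECT p2.PersonName
  FROM dbo.Person AS p1, dbo.FriendOf AS f, dbo.Person AS p2
  WHERE MATCH(p1-(f)->p2)
    AND p1.PersonName = N'Alice';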



Above and beyond these top-line features, there are more enhancements that you haven’t heard as much about, but we hope will truly delight you:


  • Resumable online index rebuild lets you stop and start index maintenance. This gives you the ability to optimize index performance by re-indexing more frequently – without having to wait for a long maintenance window. It also means you can pick up right where you left off in the event of a disruption to database service.
  • LOB compression in columnstore indexes. Previously, it was difficult to include data which contained LOBs in a columnstore index due to size. Now those LOBs can be compressed, making LOBs easier to work with and broadening the applicability of the columnstore feature.
  • Clusterless availability groups enable you to scale out reads by building an Always On availability group without an underlying cluster (see the sketch after this list).
  • Continued improvement to key performance features such as columnstore, in-memory OLTP, and the query optimizer to drive new record-setting performance. We’ll share some even more exciting perf and scale numbers soon!
  • Native scoring in T-SQL lets you score operational data using advanced analytics in near real-time because you don’t have to load the Machine Learning libraries to access your model.
  • SQL Server Integration Services (SSIS) scale-out enables you to speed package execution performance by distributing execution to multiple machines. These packages are executed in parallel, in a scale-out mode.
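
Below is a minimal sketch of the clusterless availability group mentioned in the list above. The instance names, endpoint URLs, and database name are hypothetical, and prerequisites such as database mirroring endpoints, certificates, and seeding of the secondary are omitted for brevity.

  -- Read-scale availability group with no underlying WSFC or Pacemaker cluster
  CREATE AVAILABILITY GROUP [ag_readscale]
  WITH (CLUSTER_TYPE = NONE)
  FOR DATABASE [SalesDB]
  REPLICA ON
      N'sqlnode1' WITH (ENDPOINT_URL = N'tcp://sqlnode1:5022',
                        AVAILABILITY_MODE = ASYNCHRONOUS_COMMIT,
                        FAILOVER_MODE = MANUAL,
                        SECONDARY_ROLE (ALLOW_CONNECTIONS = ALL)),
      N'sqlnode2' WITH (ENDPOINT_URL = N'tcp://sqlnode2:5022',
                        AVAILABILITY_MODE = ASYNCHRONOUS_COMMIT,
                        FAILOVER_MODE = MANUAL,
                        SECONDARY_ROLE (ALLOW_CONNECTIONS = ALL));

  -- On the secondary instance, the group is joined the same way, without a cluster:
  -- ALTER AVAILABILITY GROUP [ag_readscale] JOIN WITH (CLUSTER_TYPE = NONE);
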
What’s new in SQL Server 2017





Many enhancements were made to SQL Server Analysis Services including:
  • Modern “get data” experience with a number of new connectors like Oracle, MySQL, Sybase, Teradata, and more to come. New transformations enable mashing up of the data being ingested into tabular models.
  • Object-level security for tables and columns.
  • Detail rows and ragged hierarchies support, enabling additional drill-down capabilities for your tabular models.
Enhancements were made to SQL Server Reporting Services as well, including:
  • Lightweight installer with zero impact on your SQL Server databases or other SQL Server features.
  • REST API for programmatic access to reports, KPIs, data sources, and more.
  • Report comments, enabling users to engage in discussion about reports.



In addition to the ability to upgrade existing SQL Server to 2017, there are a few more benefits to renewing your software assurance:


  • Machine Learning Server for Hadoop, formerly R Server, brings R and Python based, scalable analytics to Hadoop and Spark environments, and it is now available to SQL Server Enterprise edition customers as a Software Assurance benefit.
  • SQL Server Enterprise Edition Software Assurance benefits also enable you to run Power BI Report Server. Power BI Report Server enables self-service BI and enterprise reporting, all in one solution by allowing you to manage your SQL Server Reporting Services (SSRS) reports alongside your Power BI reports. Power BI Report Server is also included with the purchase of Power BI Premium.
  • Lastly, but importantly, we are also modernizing how we service SQL Server. Please see our release management blog for all the details on what to expect for servicing SQL Server 2017 and beyond.

Microsoft will continue to invest in SQL Server 2017 and its cloud-first development model to ensure that the pace of innovation stays fast.

SQL Server 2017 sets the standard when it comes to speed and performance. Building on the incredible work of SQL Server 2016 (see the blog series It Just Runs Faster), SQL Server 2017 is fast: built-in, simple, and online. Maybe you caught my presentation at Microsoft Ignite where I demonstrated 1 million transactions per minute on my laptop using the popular tool HammerDB¹ by simply installing SQL Server out of the box with no configuration changes (with the HammerDB client and SQL Server on the same machine!).

SQL Server 2017 on Linux Introduction


Consider for a minute all the built-in capabilities that power the speed of SQL Server. From a SQLOS scheduling engine that minimizes OS context switches to read-ahead scanning to automatic scaling as you add NUMA and CPUs. And we parallelize everything! From queries to indexes to statistics to backups to recovery to background threads like LogWriter. We partition and parallelize our engine to scale from your laptop to the biggest servers in the world.

Like the enhancements described in It Just Runs Faster for SQL Server 2016, we are always looking to tune our engine for speed, all based on customer experiences. Take, for example, indirect checkpoint, which is designed to provide a more predictable recovery time for a database. We boosted the scalability of this feature based on customer feedback. We also made scalability improvements for parallel scanning and consistency check performance. No knobs required. Just built-in for speed.

One of the coolest performance aspects of built-in speed is online operations. We know you need to perform other maintenance tasks besides just running queries, while keeping your application up and running, so we support online backups, consistency checks, and index rebuilds. SQL Server 2017 enhances this functionality with resumable online index rebuilds, allowing you to pause an index build and resume it at any time (even after a failure).
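
As a quick illustration of resumable online index rebuilds, here is a hedged T-SQL sketch; the index and table names are hypothetical.

  -- Start an online, resumable rebuild, limited to a 60-minute window
  ALTER INDEX IX_Orders_CustomerId ON dbo.Orders
  REBUILD WITH (ONLINE = ON, RESUMABLE = ON, MAX_DURATION = 60 MINUTES);

  -- Pause the rebuild (for example, ahead of a busy period) and resume it later
  ALTER INDEX IX_Orders_CustomerId ON dbo.Orders PAUSE;
  ALTER INDEX IX_Orders_CustomerId ON dbo.Orders RESUME;

  -- Track progress of resumable operations
  SELECT name, percent_complete, state_desc
  FROM sys.index_resumable_operations;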

Microsoft SQL Server 2017 Deep Dive


SQL Server 2017 is faster than you think. SQL Server 2017 was designed from the beginning to run fast on popular Linux distributions such as Red Hat Enterprise Linux, SUSE Linux Enterprise Server, and Ubuntu, whether that is on your server or in a Docker container. Don’t believe it? Check out our world record 1TB TPC-H benchmark result (non-clustered) for SQL Server on Red Hat Enterprise Linux. Even though this is our first release on Linux, we know how to configure and run SQL Server on Linux for maximum speed. Read our best practices guide for performance settings on Linux in our documentation. We know it performs well because our customers tell us. Read the amazing story of dv01 and how SQL Server on Linux exceeded their performance expectations as they migrated from PostgreSQL.

SQL Server 2017 Deep Dive - @Ignite 2017


One of the key technologies to achieve a result like this is columnstore indexes. This is one of the most powerful features of SQL Server for high-speed analytic queries and large databases. Columnstore indexes boost performance by organizing data by column rather than by row as traditional indexes do, compressing data to reduce memory and disk footprint, filtering scans automatically through rowgroup elimination, and processing queries in batches. SQL Server runs at warp speed for data warehouses, and columnstore is the fuel. At Microsoft Ignite, I demonstrated how columnstore indexes can make Power BI with DirectQuery against SQL Server faster when handling the self-service, ad-hoc nature of Power BI queries.
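
As a minimal sketch of the columnstore feature described above (the fact table and column names are hypothetical), a clustered columnstore index turns a table into compressed column segments that analytic queries scan in batch mode:

  CREATE CLUSTERED COLUMNSTORE INDEX CCI_FactSales
  ON dbo.FactSales;

  -- A typical analytic aggregation now benefits from segment elimination,
  -- compression, and batch-mode processing
  SELECT ProductKey, SUM(SalesAmount) AS TotalSales
  FROM dbo.FactSales
  GROUP BY ProductKey;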

Microsoft Ignite 2017 - SQL Server on Kubernetes, Swarm, and Open Shift


SQL Server also excels at transaction processing, the heart of many top enterprise workloads. Got RAM? Not only does columnstore use in-memory technologies to achieve speed, but our In-Memory OLTP feature focuses on optimized access to memory-optimized tables. This feature is named OLTP, but it can be used for so much more: ETL staging tables, IoT workloads, memory-optimized table types (no more tempdb!), and “caching” tables. One of our customers was able to achieve a throughput of 1.2M batch requests/sec using SCHEMA_ONLY memory-optimized tables. To really boost transaction processing, also consider using SQL Server’s support for Persistent Memory (NVDIMM-N) and our optimization for transaction log performance (get ready for WRITELOG waits = 0!). SQL Server 2017 supports any Persistent Memory technology supported on Windows Server 2016 and later releases.
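
For the “caching” style of use mentioned above, a SCHEMA_ONLY memory-optimized table is a minimal example (hypothetical names; it assumes the database already has a MEMORY_OPTIMIZED_DATA filegroup):

  CREATE TABLE dbo.SessionCache
  (
      SessionId UNIQUEIDENTIFIER NOT NULL
          PRIMARY KEY NONCLUSTERED HASH WITH (BUCKET_COUNT = 1000000),
      Payload   NVARCHAR(4000),
      CreatedAt DATETIME2 NOT NULL
  )
  WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_ONLY);  -- schema persists across restarts, data does not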

Many customers I talk to have great performance when they first deploy SQL Server and their application. Keeping SQL Server fast and tuned is more of a challenge. SQL Server 2017 comes with features to keep you fast and tuned automatically and adaptively. Our query processing engine has all types of capabilities to create and build query plans that maximize the performance of your queries. We have created a new feature family in SQL Server 2017 to make it smarter, called Adaptive Query Processing. Imagine running a query that is not quite the speed you expect because of an insufficient memory grant (a thorn in the side of many SQL Server users, as it can lead to a spill to tempdb). With Adaptive Query Processing, future executions of this query will receive a corrected memory grant that avoids the spill, all without requiring a recompilation of the query plan. Adaptive Query Processing handles other scenarios as well, such as adaptive joins and interleaved execution of table-valued functions.
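
Adaptive Query Processing does not require new syntax; as a hedged sketch, the features light up once a database (here the hypothetical SalesDB) runs under compatibility level 140:

  -- Batch mode memory grant feedback, batch mode adaptive joins, and
  -- interleaved execution become available at compatibility level 140
  ALTER DATABASE [SalesDB] SET COMPATIBILITY_LEVEL = 140;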

Choosing technologies for a big data solution in the cloud


Another way to keep you tuned is the amazing feature we added in SQL Server 2016 called Query Store. Query Store provides built-in capabilities to track and analyze query performance, all stored in your database. For SQL Server 2017, we made tuning adjustments to Query Store to make it more efficient, based on learnings in our Azure SQL Database service, where Query Store is enabled for millions of databases. We added wait statistics, so now you have an end-to-end picture of query performance. Perhaps the most compelling enhancement in SQL Server 2017, though, is Automatic Tuning. Parameter sniffing got you down? Automatic Tuning uses Query Store to detect query plan regressions and automatically forces a previous plan that used to run fast. What I love about this feature is that even if you don’t have it turned on, you can see the recommendations it has detected about plan regressions. Then you can either manually force plans that you feel have regressed or turn on the feature to have SQL Server do it for you.
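
A hedged T-SQL sketch of the Query Store and Automatic Tuning workflow follows; the database name is hypothetical.

  -- Enable Query Store, then turn on automatic plan correction
  ALTER DATABASE [SalesDB] SET QUERY_STORE = ON;
  ALTER DATABASE [SalesDB] SET AUTOMATIC_TUNING (FORCE_LAST_GOOD_PLAN = ON);

  -- Review the plan-regression recommendations SQL Server has detected
  SELECT reason, score, details
  FROM sys.dm_db_tuning_recommendations;

  -- Wait statistics captured per plan (new to Query Store in SQL Server 2017)
  SELECT TOP (10) plan_id, wait_category_desc, total_query_wait_time_ms
  FROM sys.query_store_wait_stats
  ORDER BY total_query_wait_time_ms DESC;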

Introduction to PolyBase


SQL Server 2017 is the fastest database everywhere you need it. Whether it is your laptop, in your private cloud, or in our Azure public cloud infrastructure. Whether it is running on Linux, Windows, or Docker Containers, we have the speed to power any workload your application needs.

As I mentioned above, back in April we announced our world record TPC-H 1TB data warehousing workload (non-clustered) for SQL Server 2017 running on an HPE ProLiant DL380 Gen9 using Red Hat Enterprise Linux².

Perhaps you missed the announcement, in June of 2017, of a new world record TPC-E benchmark result³ for SQL Server 2017 on Windows Server 2016 running on a Lenovo ThinkSystem SR650, continuing to demonstrate our leadership in database performance. This benchmark, running on a two-socket system using Intel Xeon Scalable processors, has set a new standard for price and performance, becoming the first TPC-E benchmark result ever to come in under $100/tpsE.

We continued to show our proven speed for analytics by announcing in July of 2017 a new TPC-H 10TB (non-clustered) world record benchmark result⁴ of 1,336,109 QphH on Windows Server 2016 using a Lenovo ThinkSystem SR950 system with 6TB of RAM and 224 logical CPUs.

While benchmarks can show the true speed of SQL Server, we believe it can perform well with your workload and maximize the computing power of your server. Perhaps you caught the session at Ignite where my colleague Travis Wright showed how we can scan a 180 billion row table (from a 30TB database) in our labs in under 20 seconds, driving 480 CPUs to 100% capacity. And if you don’t believe SQL Server is deployed in some of the biggest installations and servers in the world, I recently polled some of our field engineers, the SQL Customer Advisory Team, and MVPs, asking them for their largest SQL Server deployments. Over 30 people responded, and the average footprint of these installations was 3TB+ of RAM on machines with 128 physical cores. Keep in mind that SQL Server can theoretically scale to 24TB of RAM on Windows and 64TB on Linux. And it supports the maximum CPUs of those systems (64 sockets with unlimited cores on Windows and 5,120 logical CPUs on Linux). Look for more practical and fun demonstrations of the speed of SQL Server in the future.

Microsoft cloud big data strategy


It could be that you are consolidating your deployments and want to run SQL Server in an Azure Virtual Machine, but you are not sure if the capacity is there for your performance needs. Consider that Azure Virtual Machines now include the new M-Series, which supports up to 128 vCPUs, 2TB of RAM, and 64 data disks with a capacity of 160,000 IOPS. It could be that in your environment you want to scale out your read workload with availability group secondary replicas but don’t want to invest in failover clustering. SQL Server 2017 introduces the capability of read-scale availability groups without clustering, supported on both Windows and Linux. Two other very nice performance features new to SQL Server 2017 are SSIS Scale Out, for those with data loading needs, and native scoring, which integrates machine learning algorithms into the SQL Server engine for maximum performance.
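
Native scoring is exposed through the T-SQL PREDICT function. A minimal, hedged sketch is below; the model table, model name, and scoring columns are hypothetical, and the model is assumed to have been trained and serialized beforehand (for example with the RevoScaleR or revoscalepy libraries):

  -- Load a pre-trained, serialized model and score new rows in-engine,
  -- without calling out to an external R or Python runtime
  DECLARE @model VARBINARY(MAX) =
      (SELECT model_object FROM dbo.Models WHERE model_name = N'churn_model');

  SELECT d.CustomerId, p.ChurnProbability
  FROM PREDICT(MODEL = @model, DATA = dbo.NewCustomers AS d)
  WITH (ChurnProbability FLOAT) AS p;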

Microsoft Technologies for Data Science 201612


SQL Server 2017 brings a unique set of features and speed to the database market: a database engine that is fast and built-in with the power to scale, and even faster when taking advantage of technologies like columnstore indexes and In-Memory OLTP; an engine that provides automation and adapts to keep you fast and tuned; and the fastest database everywhere you need it.

Machine learning services with SQL Server 2017

More Information:

https://docs.microsoft.com/en-us/sql/sql-server/sql-server-2017-release-notes

http://www.databasejournal.com/features/mssql/slideshows/9-new-features-with-sql-server-2017.html

https://www.microsoft.com/en-us/sql-server/sql-server-2017

https://myignite.microsoft.com/sessions

https://blogs.technet.microsoft.com/dataplatforminsider/2017/10/02/sql-server-2017-on-windows-linux-and-docker-is-now-generally-available/

https://blogs.msdn.microsoft.com/bobsql/

http://www.hammerdb.com

http://www.hammerdb.com/benchmarks.html

https://blogs.msdn.microsoft.com/sqlserverstorageengine/

https://blogs.technet.microsoft.com/dataplatforminsider/2017/09/28/enhancing-query-performance-with-adaptive-query-processing-in-sql-server-2017/

https://blogs.technet.microsoft.com/dataplatforminsider/2017/09/29/view-on-demand-microsoft-data-platform-sql-server-2017-and-azure-data-services/

https://blogs.technet.microsoft.com/dataplatforminsider/2017/10/10/whats-new-in-sql-server-management-studio-17-3/

https://redmondmag.com/articles/2017/09/25/microsoft-launches-sql-server-2017.aspx

https://redmondmag.com/articles/2017/04/19/sql-server-2017-preview.aspx

https://arstechnica.com/gadgets/2017/09/microsoft-ignite-2017-azure-sql/











Oracle Introduces Autonomous Database Cloud, Robotic Security


Oracle Introduces Autonomous Data Warehouse Cloud with Demonstrated Performance 10x Faster at Half the Cost of Amazon

The World’s #1 Database Is Now the World’s First Self-Driving Database.   

Oracle Sets New Standard with World’s First Autonomous Database

Journey to Autonomous Database


Oracle is revolutionizing how data is managed with the introduction of the world’s first "self-driving" database. This ground-breaking Oracle Database technology automates management to deliver unprecedented availability, performance, and security—at a significantly lower cost.

Powered by Oracle Database 18c, the next generation of the industry-leading database, Oracle Autonomous Database Cloud offers total automation based on machine learning and eliminates human labor, human error, and manual tuning.


Get unmatched reliability and performance at half the cost.

  • No Human Labor: Database automatically upgrades, patches, and tunes itself while running; automates security updates with no downtime window required.
  • No Human Error: SLA guarantees 99.995% reliability and availability, which minimizes costly planned and unplanned downtime to less than 30 minutes a year.
  • No Manual Performance Tuning: Database consumes less compute and storage because of machine learning and automatic compression. Combined with lower manual admin costs, Oracle offers even bigger cost savings.

At Oracle OpenWorld 2017, Oracle Chairman of the Board and CTO Larry Ellison unveiled his vision for the world’s first autonomous database cloud.

Oracle OpenWorld 2017 Review (31st October 2017 - 250 slides)


Powered by Oracle Database 18c, the next generation of the industry-leading database, Oracle Autonomous Database Cloud uses ground-breaking machine learning to enable automation that eliminates human labor, human error and manual tuning, to enable unprecedented availability, high performance and security at a much lower cost.

Oracle database cloud architecture | Video tutorial


“This is the most important thing we’ve done in a long, long time,” said Ellison. “The automation does everything. We can guarantee availability of 99.995 percent, less than 30 minutes of planned or unplanned downtime.”



The Oracle Autonomous Database Cloud eliminates the human labor associated with tuning, patching, updating and maintaining the database and includes the following capabilities:

Self-Driving: Provides continuous adaptive performance tuning based on machine learning. Automatically upgrades and patches itself while running. Automatically applies security updates while running to protect against cyberattacks.
Self-Scaling: Instantly resizes compute and storage without downtime. Cost savings are multiplied because Oracle Autonomous Database Cloud consumes less compute and storage than Amazon, with lower manual administration costs.
Self-Repairing: Provides automated protection from downtime. SLA guarantees 99.995 percent reliability and availability, which reduces costly planned and unplanned downtime to less than 30-minutes per year.

The Oracle Autonomous Database Cloud handles many different workload styles, including transactions, mixed workloads, data warehouses, graph analytics, departmental applications, document stores and IoT. The first Autonomous Database Cloud offering, for data warehouse workloads, is planned to be available in calendar year 2017.




Oracle Autonomous Data Warehouse Cloud

Oracle Autonomous Data Warehouse Cloud is a next-generation cloud service built on the self-driving Oracle Autonomous Database technology using machine learning to deliver unprecedented performance, reliability and ease of deployment for data warehouses. As an autonomous cloud service, it eliminates error-prone manual management tasks and frees up DBA resources, which can now be applied to implementing more strategic business projects.

“Every organization is trying to leverage the overwhelming amount of data generated in our digital economy,” said Carl Olofson, research vice president, IDC. “With a history of established leadership in the database software market segment, it is no surprise that Oracle is pioneering a next-generation data management platform. Oracle Autonomous Data Warehouse Cloud is designed to deliver industry-leading database technology performance with unmatched flexibility, enterprise scale and simplicity. The intent is to ensure that businesses get more value from their data and modernize how data is managed.”

Highlights of the Oracle Autonomous Data Warehouse Cloud include:


Simplicity: Unlike traditional cloud services with complex, manual configurations that require a database expert to specify data distribution keys and sort keys, build indexes, reorganize data or adjust compression, Oracle Autonomous Data Warehouse Cloud is a simple “load and go” service. Users specify tables, load data and then run their workloads in a matter of seconds—no manual tuning is needed.
Industry-Leading Performance: Unlike traditional cloud services, which use generic compute shapes for database cloud services, Oracle Autonomous Data Warehouse Cloud is built on the high-performance Oracle Exadata platform. Performance is further enhanced by fully-integrated machine learning algorithms which drive automatic caching, adaptive indexing and advanced compression.
Instant Elasticity: Oracle Autonomous Data Warehouse Cloud allocates new data warehouses of any size in seconds and scales compute and storage resources independently of one another with no downtime. Elasticity enables customers to pay for exactly the resources that the database workloads require as they grow and shrink.

Oracle Database 18c

Oracle OpenWorld 2017: Keynote by Larry Ellison


Oracle Autonomous Database Cloud is powered by the next generation of the world’s #1 database, Oracle Database 18c. Oracle Database 18c delivers breakthrough automation capabilities, as well as greatly enhanced OLTP, analytics and consolidation technologies.

If everything Oracle CTO and co-founder Larry Ellison said the evening of Oct. 1 is true, then the company's board of directors, investors and stockholders had better have a meeting and find out whether Oracle will actually be able to make a profit from this new-fangled cloud-service business.

Highly Automated IT



Ellison spent a good portion of his opening keynote at Oracle OpenWorld 2017 demonstrating how "cheap" Oracle's in-cloud workload processing is versus Amazon Web Services' RDS (relational database service). He explained in a series of demos that because Oracle's cloud service is anywhere from 6 to 15 times faster than AWS in processing the same workload, Oracle is thus 6 to 15 times "cheaper" than AWS.

Case in point: for the same "market research" workload, Ellison pitted an 8-node Oracle Autonomous Data Warehouse Cloud instance against a similar 8-node AWS DS2.xlarge configuration. The same eight queries were fed to both cloud services.

Partner Webcast – Data Management Platform for Innovation



Timers were started. Oracle claimed AWS's processors took 244 seconds to do the job, costing the user 27 cents' worth of computing time. Oracle then claimed its own service took a mere 38 seconds to do the job, costing the user 2 cents' worth of cloud time. This is going to be the new normal for the super-fast new DB, Ellison contended.

Larry Ellison introduced not only the aforementioned "world's first autonomous database cloud," but also an as-yet unnamed automated security product, which he said he would detail later in the week. He also used some of his time onstage to skewer Equifax following the security breach it suffered earlier this year in which more than 140 million people had their personal credit information compromised. We'll get to that in a minute.

Oracle Technology Monthly Oktober 2017


Larry Ellison said both the Autonomous Database Cloud and the security system use machine learning for automation to eliminate human labor, human error and manual tuning, and to enable availability, high performance and security at a much lower cost than competitors that include AWS.

“These systems are highly, highly automated--this means we do everything we can to avoid human intervention," Larry Ellison said. "This is the most important thing we've done in a long, long time. The automation does everything.

Security is 'Robotic,' 'Autonomous'

"They're robotic; they're autonomous. In security, it's our computers versus their (hackers') computers. It's cyber warfare. We have to have a lot better computer systems, a lot more automation if we're going to defend our data."

Ellison said the automated security system would scan the entire system 24/7, know immediately when an intruder gets into it, and would be able to stop and isolate the intruder faster than any human can do it. He didn't mention that there are already systems out there that do the same thing, such as Vectra Networks, Vera and others.

MOUG17 Keynote: What's New from Oracle Database Development


On the DB side, Ellison said the database cloud eliminates human labor that touches tuning, patching, updating and maintaining the database. The company listed the following capabilities:

Self-Driving: Provides continuous adaptive performance tuning based on machine learning. Automatically upgrades and patches itself while running. Automatically applies security updates while running to protect against cyberattacks.
Self-Scaling: Instantly resizes compute and storage without downtime. Cost savings are multiplied because Oracle Autonomous Database Cloud consumes less compute and storage than Amazon, with lower manual administration costs.




Self-Repairing: Provides automated protection from downtime. SLA guarantees 99.995 percent reliability and availability, which reduces costly planned and unplanned downtime to less than 30-minutes per year.
Oracle said the database cloud is designed to handle a high number of different workloads, including transactions, mixed workloads, data warehouses, graph analytics, departmental applications, document stores and IoT.

The first Autonomous Database Cloud offering, for data warehouse workloads, is planned to be available in calendar year 2017, Ellison said.

Details on Oracle's Autonomous Data Warehouse Cloud

Oracle Autonomous Data Warehouse Cloud ostensibly eliminates error-prone manual management tasks and frees up DBA resources, which can now be applied to implementing more strategic business projects.

Key features, according to Oracle, include:


  • Simplicity: Unlike traditional cloud services with complex, manual configurations that require a database expert to specify data distribution keys and sort keys, build indexes, reorganize data or adjust compression, Oracle Autonomous Data Warehouse Cloud is a simple “load and go” service. Users specify tables, load data and then run their workloads in a matter of seconds—no manual tuning is needed.
  • Performance: Unlike conventional cloud services, which use generic compute shapes for database cloud services, Oracle Autonomous Data Warehouse Cloud is built on the high-performance Oracle Exadata platform. Performance is further enhanced by fully-integrated machine learning algorithms which drive automatic caching, adaptive indexing and advanced compression.
  • Elasticity: Oracle Autonomous Data Warehouse Cloud allocates new data warehouses of any size in seconds and scales compute and storage resources independently of one another with no downtime. Elasticity enables customers to pay for exactly the resources that the database workloads require as they grow and shrink.
  • Oracle Autonomous Database Cloud: Powered by the company's latest database, Oracle Database 18c, which offers new automation capabilities in addition to enhanced OLTP, analytics, and consolidation technologies.


Ellison on Equifax, Security

"You've got to know (about a breach) during the reconnaissance phase of a cyber attack," Ellison said, "when someone is nosing around in your computer system--trying to steal a password, trying to steal someone's identity. As they come in and start looking around, you'd better detect that that's happening."

Ellison chastised Equifax for not patching its system in time.

"I know it's a shock, but there was a patch available for Equifax, but somebody didn't apply it. I saw where the CEO  lost his job--which doesn't bother me now, I'm not a CEO. That's a risky job those guys have," Ellison said with a slight laugh. "But no, I'd lose my job, too. It's a clean sweep (with a breach like Equifax's); directors aren't safe, nobody's safe when something like that happens."

This is going to get a lot worse before it gets better, Ellison said.

"People are going to get better at stealing data; we have to get better at protecting it," he said.

The Oracle Autonomous Cloud will become available on-premises or in the Oracle public or private clouds for data warehousing production workloads in December. It will become available for other specific workloads in June 2018.

Databases Are Moving to Cloud, Are You?

With Oracle CTO Larry Ellison's keynote at Oracle OpenWorld 2017 (#oow17) about the world's first autonomous database, 18c, I received hundreds of messages: some worried, 'Is the DBA career over?', while others were ready to prepare for now and the future, asking, 'What should Oracle DBAs do to prepare for the future or for the cloud?'

MOUG17: DBA 2.0 is Dead; Long Live DBA 3.0 (How to Prepare!)


First of all, as a DBA you have nothing to worry about. There will still be a role for DBAs, but it will transition away from routine tasks like install, patch, and upgrade toward more innovative tasks like architecture design, deployment, security, integration, and migration to the cloud.

Databases are already moving to the cloud, and over the next few years more and more databases will move there (mainly in the PaaS space; if you are not familiar with SaaS, PaaS, and IaaS, check here).

In my view, every change brings new opportunity, and here is your chance to learn about this new role of Oracle Cloud DBA to stay ahead in your professional career, earn more, and enjoy what you do.

Role of Oracle Cloud DBA

One of the most common questions asked in our private Facebook group dedicated to Oracle Cloud, and by those who join our Oracle Database Cloud Administration (Cloud DBA) training, is about the role of the Cloud DBA.

Oracle Database Cloud: DBCS architecture for DBAs


Looking at so many requests, I created a video on how the role of the DBA changes with the cloud, and how roles and responsibilities change as you upgrade yourself from DBA to Cloud DBA.

Cloud DBA: Role of DBA in Cloud


As shown in the video, these are the tasks you will be doing as an Oracle Cloud DBA:


  • Design and specification of the database, i.e. CPU, memory, disk space, future growth, high availability, and disaster recovery (yes, even in the cloud you have to consider HA and DR).
  • Creating and configuring an Oracle Database is very simple in the cloud, with the click of a button or via REST APIs and JSON, but someone needs to invoke those scripts or clicks (the Cloud DBA performs this task).
  • As an Oracle Cloud DBA, you need to learn new tools for start/stop, i.e. DBaaSCLI (Database as a Service Command Line Interface) or the Database Service Console.
  • You still have to patch, but using new tools like DBaaSCLI with DBPATCHM, or RACCLI.
  • You still have to do backup and recovery, using new tools and the Oracle Storage Service (OSS).

Patching an Oracle Database Cloud Service


  • You will still be learning about migration to the cloud (lift and shift), and this is where you can expect a lot of work migrating existing on-premises databases to the cloud.

Oracle Multitenant: Isolation and Agility with Economies of Scale

  • You will still be learning about disaster recovery (Data Guard) in the cloud and setting up disaster recovery (DR) in the cloud for data center failover (this is not available out of the box, and you as Cloud DBA will have to set it up).

Oracle RAC on Oracle Database Cloud Bare Metal Services

  • You will still be configuring RAC in the cloud, or deploying a maximum availability architecture. Check my video blog on setting up a RAC database in Oracle Cloud.
  • You will still be configuring OEM Cloud Control 13c Hybrid Cloud Management to manage both on-premises and cloud databases, or using Oracle Management Cloud (OMC).

Oracle Bare Metal Cloud Services overview

Experience of being a Oracle Bare Metal Cloud DBA by Satyendra Pasalapudi




Pluggable Databases on Oracle Cloud

More Information:

https://www.oracle.com/corporate/pressrelease/oow17-oracle-autonomous-database-100217.html

https://www.oracle.com/database/autonomous-database/feature.html

https://www.forbes.com/sites/oracle/2017/10/02/larry-ellison-introduces-a-big-deal-the-oracle-autonomous-database/#6a5e2c04f0b0

https://www.youtube.com/channel/UCr6mzwq_gcdsefQWBI72wIQ/videos

http://dbastuff.blogspot.nl/2017/10/

https://www.oracle.com/database/autonomous-database/index.html

https://oracle-base.com/blog/2017/10/02/oracle-autonomous-database-and-the-death-of-the-dba/

http://oracle-help.com/articles/oracle-18c-future-database/

http://www.zdnet.com/article/oracle-launches-18c-its-autonomous-database-and-automated-cybersecurity-system/

http://db.geeksinsight.com/presentations-notes/

http://www.oracle.com/us/products/database/autonomous-dw-cloud-ipaper-3938921.pdf


IBM Big Data Platform





What is big data?

Every day, we create 2.5 quintillion bytes of data — so much that 90% of the data in the world today has been created in the last two years alone. This data comes from everywhere: sensors used to gather climate information, posts to social media sites, digital pictures and videos, purchase transaction records, and cell phone GPS signals to name a few. This data is big data.

Enterprise Data Warehouse Optimization: 7 Keys to Success



Big data spans three dimensions: Volume, Velocity and Variety.

Volume: Enterprises are awash with ever-growing data of all types, easily amassing terabytes—even petabytes—of information.

Turn 12 terabytes of Tweets created each day into improved product sentiment analysis
Convert 350 billion annual meter readings to better predict power consumption

Velocity: Sometimes 2 minutes is too late. For time-sensitive processes such as catching fraud, big data must be used as it streams into your enterprise in order to maximize its value.

Scrutinize 5 million trade events created each day to identify potential fraud
Analyze 500 million daily call detail records in real-time to predict customer churn faster

Variety: Big data is any type of data - structured and unstructured data such as text, sensor data, audio, video, click streams, log files and more. New insights are found when analyzing these data types together.

Overview - IBM Big Data Platform




Monitor hundreds of live video feeds from surveillance cameras to target points of interest
Exploit the 80% data growth in images, video and documents to improve customer satisfaction


Big data is more than simply a matter of size; it is an opportunity to find insights in new and emerging types of data and content, to make your business more agile, and to answer questions that were previously considered beyond your reach. Until now, there was no practical way to harvest this opportunity. Today, IBM’s platform for big data uses state of the art technologies including patented advanced analytics to open the door to a world of possibilities.

IBM big data platform

Data Science Experience: Build SQL queries with Apache Spark



Do you have a big data strategy? IBM does. We’d like to share our know-how with you to help your enterprise solve its big data challenges.

IBM is unique in having developed an enterprise class big data platform that allows you to address the full spectrum of big data business challenges.

The platform blends traditional technologies that are well suited for structured, repeatable tasks together with complementary new technologies that address speed and flexibility and are ideal for ad hoc data exploration, discovery, and unstructured analysis.
IBM’s integrated big data platform has four core capabilities: Hadoop-based analytics, stream computing, data warehousing, and information integration and governance.





Fig. 1 - IBM big data platform




The core capabilities are:

Hadoop-based analytics: Processes and analyzes any data type across commodity server clusters.
Stream Computing: Drives continuous analysis of massive volumes of streaming data with sub-millisecond response times.
Data Warehousing: Delivers deep operational insight with advanced in-database analytics.
Information Integration and Governance: Allows you to understand, cleanse, transform, govern and deliver trusted information to your critical business initiatives.

Delight Clients with Data Science on the IBM Integrated Analytics System


Supporting Platform Services:

Visualization & Discovery: Helps end users explore large, complex data sets.
Application Development: Streamlines the process of developing big data applications.
Systems Management: Monitors and manages big data systems for secure and optimized performance.
Accelerators: Speeds time to value with analytical and industry-specific modules.

IBM DB2 analytics accelerator on IBM integrated analytics system technical overview







How Big Data and Predictive Analytics are revolutionizing AML and Financial Crime Detection


Big data in action

What types of business problems can a big data platform help you address? There are multiple uses for big data in every industry – from analyzing larger volumes of data than was previously possible to drive more precise answers, to analyzing data in motion to capture opportunities that were previously lost. A big data platform will enable your organization to tackle complex problems that previously could not be solved.

Big data = Big Return on Investment (ROI)

While there is a lot of buzz about big data in the market, it isn’t hype. Plenty of customers are seeing tangible ROI using IBM solutions to address their big data challenges:

Healthcare: 20% decrease in patient mortality by analyzing streaming patient data
Telco: 92% decrease in processing time by analyzing networking and call data
Utilities: 99% improved accuracy in placing power generation resources by analyzing 2.8 petabytes of untapped data

IBM’s big data platform is helping enterprises across all industries. IBM understands the business challenges and dynamics of your industry and we can help you make the most of all your information.

The Analytic Platform behind IBM’s Watson Data Platform - Big Data



When companies can analyze ALL of their available data, rather than a subset, they gain a powerful advantage over their competition. IBM has the technology and the expertise to apply big data solutions in a way that addresses your specific business problems and delivers rapid return on investment.

The data stored in the cloud environment is organized into repositories. These repositories may be hosted on different data platforms (such as a database server, Hadoop, or a NoSQL data platform) that are tuned to support the types of analytics workload that is accessing the data.

What’s new in predictive analytics: IBM SPSS and IBM decision optimization


The data that is stored in the repositories may come from legacy, new, and streaming sources, enterprise applications, enterprise data, cleansed and reference data, as well as output from streaming analytics.

Breaching the 100TB Mark with SQL Over Hadoop



Types of data repositories include:

  • Catalog: Results from discovery and IT data curation create a consolidated view of information that is reflected in a catalog. The introduction of big data increases the need for catalogs that describe what data is stored, its classification, ownership, and related information governance definitions. From this catalog, you can control the usage of the data.
  • Data virtualization: An agile approach to data management that allows an application to retrieve and manipulate data without requiring technical details about the data.
  • Landing, exploration, and archive: Allows for large datasets to be stored, explored, and augmented using a wide variety of tools since massive and unstructured datasets may mean that it is no longer feasible to design the data set before entering any data. Data may be used for archival purposes with improved availability and resiliency thanks to multiple copies distributed across commodity storage.

SparkR Best Practices for R Data Scientists
  • Deep analytics and modeling: The application of statistical models to yield information from large data sets comprised of both unstructured and semi-structured elements. Deep analysis involves precisely targeted and complex queries with results measured in petabytes and exabytes. Requirements for real-time or near-real-time responses are becoming more common.
  • Interactive analysis and reporting: Tools to answer business and operations questions over Internet-scale data sets. Tools also use popular spreadsheet interfaces for self-service data access and visualization. APIs implemented by data repositories allow output to be efficiently consumed by applications.
  • Data warehousing: Populates relational databases that are designed for building a correlated view of business operation. A data warehouse usually contains historical and summary data derived from transaction data but can also integrate data from other sources. Warehouses typically store subject-oriented, non-volatile, time-series data used for corporate decision-making. Workloads are query intensive, accessing millions of records to facilitate scans, joins, and aggregations. Query throughput and response times are generally a priority.

IBM Power leading Cognitive Systems





IBM offers a wide variety of offerings for consideration in building data repositories:
  • InfoSphere Information Governance Catalog maintains a repository to support the catalog of the data lake. This repository can be accessed through APIs and can be used to understand and analyze the types of data stored in the other data repositories.
  • IBM InfoSphere Federation Server creates consolidated information views of your data to support key business processes and decisions.
  • IBM BigInsights for Apache Hadoop delivers key capabilities to accelerate the time to value for a data science team, which includes business analysts, data architects, and data scientists.
  • IBM PureData™ System for Analytics, powered by Netezza technology, is changing the game for data warehouse appliances by unlocking data's true potential. The new IBM PureData System for Analytics is an integral part of a logical data warehouse.
  • IBM Analytics for Apache Spark is a fully-managed Spark service that can help simplify advanced analytics and speed development.
  • IBM BLU Acceleration® is a revolutionary, simple-to-use, in-memory technology that is designed for high-performance analytics and data-intensive reporting.
  • IBM PureData System for Operational Analytics is an expert integrated data system optimized specifically for the demands of an operational analytics workload. A complete solution for operational analytics, the system provides both the simplicity of an appliance and the flexibility of a custom solution.

IBM Big Data Analytics Concepts and Use Cases





Bluemix offers a wide variety of services for data repositories:

  • BigInsights for Apache Hadoop provisions enterprise-scale, multi-node big data clusters on the IBM SoftLayer cloud. Once provisioned, these clusters can be managed and accessed from this same service.

Big Data: Introducing BigInsights, IBM's Hadoop- and Spark-based analytical platform
  • Cloudant® NoSQL Database is a NoSQL Database as a Service (DBaaS). It's built from the ground up to scale globally, run non-stop, and handle a wide variety of data types like JSON, full-text, and geospatial. Cloudant NoSQL DB is an operational data store optimized to handle concurrent reads and writes and provide high availability and data durability (a minimal Python sketch follows this list).
  • dashDB™ stores relational data, including special types such as geospatial data. You can then analyze that data with SQL or advanced built-in analytics such as predictive analytics, data mining, analytics with R, and geospatial analytics. You can leverage in-memory database technology to use both columnar and row-based tables. The dashDB web console handles common data management tasks, such as loading data, and analytics tasks such as running queries and R scripts.
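
As a concrete illustration of the Cloudant service described above, here is a minimal sketch that stores a JSON document from Python. It assumes the official python-cloudant client library and placeholder service credentials copied from the Bluemix service instance; the call signatures follow the 2.x client API as best recalled, so verify them against the library documentation.

```python
# Minimal sketch: writing a JSON document to Cloudant NoSQL DB with the
# python-cloudant client. The account URL and credentials are placeholders
# taken from a Bluemix service instance; verify call signatures against the
# python-cloudant documentation for your client version.
from cloudant.client import Cloudant

ACCOUNT_URL = "https://<account>.cloudant.com"  # placeholder
USERNAME = "<service-username>"                 # placeholder
PASSWORD = "<service-password>"                 # placeholder

client = Cloudant(USERNAME, PASSWORD, url=ACCOUNT_URL, connect=True)
try:
    # Create (or open) a database for operational sensor data.
    db = client.create_database("sensor_readings", throw_on_exists=False)
    # Cloudant stores mixed JSON content: text, numbers, nested geospatial data.
    doc = db.create_document({
        "device_id": "rack-42",
        "temperature_c": 21.5,
        "location": {"type": "Point", "coordinates": [-0.1276, 51.5072]},
    })
    print("stored document with id:", doc["_id"])
finally:
    client.disconnect()
```

The same pattern applies to reads and queries; the point is that the service is consumed directly from application code, with no cluster to provision or manage.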

IBM BigInsights: Smart Analytics for Big Data



IBM product support for big data and analytics solutions in the cloud

Now that we've reviewed the component model for a big data and analytics solution in the cloud, let's look at how IBM products can be used to implement it. In previous sections, we highlighted IBM's end-to-end solution for deploying a big data and analytics solution in the cloud.
The figure below shows how IBM products map to specific components in the reference architecture.

Figure 5. IBM product mapping



ML, AI and IBM Watson - 101 for Business





IBM product support for data lakes using cloud architecture capabilities

The following images show how IBM products can be used to implement a data lake solution. In previous sections, we highlighted IBM's end-to-end solution for deploying data lake solutions using cloud computing.

Benefits of Transferring Real-Time Data to Hadoop at Scale





Mapping on-premises and SoftLayer products to specific capabilities

Figure 7 shows how IBM products can be used to run a data lake in the cloud.

Figure 7. IBM product mapping for a data lake using cloud computing


What is Big Data University?



Big Data Scotland 2017

Big Data Scotland is an annual data analytics conference held in Scotland. Run by DIGIT in association with The Data Lab, it is free for delegates to attend. The conference is geared towards senior technologists and business leaders and aims to provide a unique forum for knowledge exchange, discussion and cross-pollination.



The programme will explore the evolution of data analytics; looking at key tools and techniques and how these can be applied to deliver practical insight and value. Presentations will span a wide array of topics from Data Wrangling and Visualisation to AI, Chatbots and Industry 4.0.

https://www.thedatalab.com/






More Information:


https://www.ibm.com/developerworks/cloud/library/cl-ibm-leads-building-big-data-analytics-solutions-cloud-trs/index.html#N10642

https://www.ibm.com/developerworks/learn/

https://www.ibm.com/developerworks/learn/analytics/

https://cognitiveclass.ai/learn/big-data/

http://www.ibmbigdatahub.com/blog/top-10-ibm-big-data-analytics-hub-podcasts-2017

https://www.ibm.com/power/solutions/bigdata-analytics

https://www-935.ibm.com/services/big-data/

https://www.ibm.com/analytics/hadoop/big-data-analytics

https://www.dwbisummit.com/?lang=en

http://www.ibmbigdatahub.com

https://www.forbes.com/sites/chrisversace/2014/04/01/talking-big-data-and-analytics-with-ibm/#2aff2108a66e

Micro-segmentation Defined – NSX Securing "Anywhere"



Why It’s Time to Build a Zero Trust Network


Network security, for a long time, has worked off of the old Russian maxim, “trust but verify.” Trust a user, but verify it’s them. However, today’s network landscape — where the Internet of Things, the Cloud, and more are introducing new vulnerabilities — makes the “verify” part of “trust but verify” difficult and inefficient. We need a simpler security model. That model: Zero Trust.

The Next Generation Network model
VMware NSX and Micro-Segmentation


Forrester Research coined the term “Zero Trust” to describe a model that prevents common and advanced persistent threats from traversing laterally inside a network. This can be done through a strict, micro-granular security model that ties security to individual workloads and automatically provisions policies. It’s a network that doesn’t trust any data packets. Everything is untrusted. Hence: Zero Trust.



So how can you deploy the Zero Trust model? Should you? To answer these questions and more, we’ve gathered John Kindervag, VP and Principal Analyst at Forrester Research, and our own VMware NSX experts to discuss Zero Trust, micro-segmentation and how VMware NSX makes it all happen in our webinar, “Enhancing Security with Zero Trust, The Software-Defined Data Center, and Micro-segmentation.” Best of all: you can watch it on-demand, on your own.

VMware NSX Security and Micro-segmentation


VMware NSX is the network virtualization platform for the Software-Defined Data Center. NSX brings the operational model of virtual machines to your data center network. This allows your organization to overcome the hardware-defined economic and operational hurdles keeping you from adopting a Zero Trust model and better overall security.



To learn more about how VMware NSX can help you be twice as secure at half the cost, visit the NSX homepage and follow @VMwareNSX on Twitter for the latest in micro-segmentation news.

The landscape of the modern data center is rapidly evolving. The migration from physical to virtualized workloads, the move toward software-defined data centers, the advent of a multi-cloud landscape, the proliferation of mobile devices accessing the corporate data center, and the adoption of new architectural and deployment models such as microservices and containers have ensured that the only constant in modern data center evolution is the quest for higher levels of agility and service efficiency. This march forward is not without peril, as security often ends up being an afterthought. The operational dexterity achieved through the ability to rapidly deploy new applications overtakes the ability of traditional networking and security controls to maintain an acceptable security posture for those application workloads. That is in addition to the fundamental problem that traditionally structured security does not work adequately even in more conventional, static data centers.

VMware NSX for vSphere - Intro and use cases



Without a flexible approach to risk management that adapts to the onset of new technology paradigms, security silos using disparate approaches are created. These silos act as control islands, making it difficult to apply risk-focused predictability to your corporate security posture and causing unforeseen risks to be realized. These actualized risks cause an organization’s attack surface to grow as the adoption of new compute technology increases, raising susceptibility to increasingly advanced threat actors.

A foundational aspect of solving this problem is the ability to implement micro-segmentation anywhere. NSX is a networking and security platform able to deliver micro-segmentation across all the evolving components comprising the modern datacenter. NSX based micro-segmentation enables you to increase the agility and efficiency of your data center while maintaining an acceptable security posture. The following blog series will define the necessary characteristics of micro-segmentation as needed to provide effective security controls within the modern data center and demonstrate how NSX goes beyond the automation of legacy security paradigms in enabling security through micro-segmentation.

ACCEPTABLE SECURITY IN THE MODERN DATA CENTER

It is no longer acceptable to utilize the traditional approach to data-center network security built around a very strong perimeter defense but virtually no protection inside the perimeter. This model offers very little protection against the most common and costly attacks occurring against organizations today, which include attack vectors originating within the perimeter. These attacks infiltrate your perimeter, learn your internal infrastructure, and laterally spread through your data center.

Architecting-in Security with Micro-Segmentation


The ideal solution to complete datacenter protection is to protect every traffic flow inside the data center with a firewall and only allow the flows required for applications to function.  This is also known as the Zero Trust model.  Achieving this level of protection and granularity with a traditional firewall is operationally unfeasible and cost prohibitive, as it would require traffic to be hair-pinned to a central firewall and virtual machines to be placed on individual VLANs (also known as pools of security).

Wade Holmes - Tackling Security Concerns with Micro segmentation


A typical 1 Rack-Unit top-of-rack data center switch performs at approximately 2Tbps, while the most advanced physical firewall performs at 200Gbps in a 19 Rack-Unit physical appliance, providing 10% of the usable bandwidth. Imagine the network resource utilization bottlenecks created by having to send all east-to-west communication from every VM to every other VM through a physical firewall, and how quickly you would run out of available VLANs (limited to 4096) to segment workloads into application-centric pools of security. This is a fundamental architectural constraint created by traditional security architecture that hampers the ability to maintain an adequate security posture within a modern datacenter.

DEFINING MICRO-SEGMENTATION

Micro-segmentation decreases the level of risk and increases the security posture of the modern data center. So what exactly defines micro-segmentation? To provide micro-segmentation, a solution requires a combination of the following capabilities, which together make it possible to achieve the outcomes noted below.

VMware NSX 101: What, Why & How




Distributed stateful firewalling for topology agnostic segmentation – Reducing the attack surface within the data center perimeter through distributed stateful firewalling and ALGs (Application Level Gateway) on a per-workload granularity regardless of the underlying L2 network topology (i.e. possible on either logical network overlays or underlying VLANs).

VMware NSX Component Overview w Tim Davis @aldtd #vBrownBag #RunNSX


Centralized ubiquitous policy control of distributed services – Enabling the ability to programmatically create and provision security policy through a RESTful API or integrated cloud management platform (CMP).
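
Since the policy model is exposed through a RESTful API, the distributed firewall can be driven from scripts or a cloud management platform. Below is a minimal sketch using Python's requests library; the NSX Manager address and credentials are placeholders, and the endpoint path and XML payload shape are assumptions modeled on the NSX-v DFW API, so verify them against the API guide for your NSX version.

```python
# Minimal sketch: programmatically creating a distributed firewall section
# through the NSX Manager REST API. Manager address and credentials are
# placeholders; the endpoint path and payload shape are assumptions modeled
# on the NSX-v DFW API and should be verified for your NSX version.
import requests

NSX_MANAGER = "https://nsx-manager.example.local"  # placeholder
AUTH = ("admin", "<password>")                     # placeholder

SECTION_XML = """
<section name="app-tier-policy">
  <rule disabled="false" logged="true">
    <name>allow-web-to-app</name>
    <action>allow</action>
  </rule>
</section>
""".strip()

response = requests.post(
    f"{NSX_MANAGER}/api/4.0/firewall/globalroot-0/config/layer3sections",
    data=SECTION_XML,
    headers={"Content-Type": "application/xml"},
    auth=AUTH,
    verify=False,  # sketch only; use trusted certificates in production
)
response.raise_for_status()
print("Created DFW section, HTTP status:", response.status_code)
```

The same call could equally be issued by a CMP workflow; the design point is that policy becomes an API-addressable object rather than a box-by-box configuration task.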

Granular unit-level controls implemented by high-level policy objects – Enabling the ability to utilize security groups for object-based policy application, creating granular application-level controls not dependent on network constructs (i.e. security groups can use dynamic constructs such as OS type and VM name, or static constructs such as Active Directory groups, logical switches, VMs, port groups, IP sets, etc.). Each application can now have its own security perimeter without relying on VLANs. See the DFW Policy Rules Whitepaper for more information.

Andy Kennedy - Scottish VMUG April 2016



Network overlay based isolation and segmentation – Logical Network overlay-based isolation and segmentation that can span across racks or data centers regardless of the underlying network hardware, enabling centrally managed multi-datacenter security policy with up to 16 million overlay-based segments per fabric.

Policy-driven unit-level service insertion and traffic steering – Enabling Integration with 3rd party solutions for advanced IDS/IPS and guest introspection capabilities.

ALIGNMENT WITH EMERGING CYBERSECURITY STANDARDS

National Institute of Standards and Technology (NIST) is the US federal technology agency that works with industry to develop and apply technology, measurements, and standards. NIST is working with standards bodies globally in driving forward the creation of international cybersecurity standards. NIST recently published NIST Special Publication 800-125B, “Secure Virtual Network Configuration for Virtual Machine (VM) Protection” to provide recommendations for securing virtualized workloads.

VMware NSX Switching and Routing with Tim Davis @aldtd #vBrownBag #RunNSX


The capabilities of micro-segmentation provided by NSX map directly to the recommendations made by NIST.

Section 4.4 of NIST 800-125b makes four recommendations for protecting virtual machine workloads within modern data center architecture. These recommendations are as follows:

VM-FW-R1: In virtualized environments with VMs running delay-sensitive applications, virtual firewalls should be deployed for traffic flow control instead of physical firewalls, because in the latter case, there is latency involved in routing the virtual network traffic outside the virtualized host and back into the virtual network.

VM-FW-R2: In virtualized environments with VMs running I/O intensive applications, kernel-based virtual firewalls should be deployed instead of subnet-level virtual firewalls, since kernel-based virtual firewalls perform packet processing in the kernel of the hypervisor at native hardware speeds.

VM-FW-R3: For both subnet-level and kernel-based virtual firewalls, it is preferable if the firewall is integrated with a virtualization management platform rather than being accessible only through a standalone console. The former will enable easier provisioning of uniform firewall rules to multiple firewall instances, thus reducing the chances of configuration errors.

VM-FW-R4: For both subnet-level and kernel-based virtual firewalls, it is preferable that the firewall supports rules using higher-level components or abstractions (e.g., security group) in addition to the basic 5-tuple (source/destination IP address, source/destination ports, protocol).

VMworld 2015: Introducing Application Self service with Networking and Security




NSX based micro-segmentation meets the NIST VM-FW-R1, VM-FW-R2 and VM-FW-R3 recommendations by providing the ability to utilize network virtualization based overlays for isolation, and distributed kernel based firewalling for segmentation, through ubiquitous, centrally managed policy control which can be fully API driven.

VMware NSX - Transforming Security



Micro-segmentation through NSX also meets the NIST VM-FW-R4 recommendation to utilize higher-level components or unit-of-trust abstractions (e.g., security groups) in addition to the basic 5-tuple (source/destination IP address, source/destination ports, protocol) for firewalling. NSX based micro-segmentation can be defined as granularly as a single application or as broadly as a data center, with controls that can be implemented based on attributes such as who you are or what device is accessing your data center.

MICRO-SEGMENTATION WITH NSX AS A SECURITY PLATFORM

Protection against advanced persistent threats that propagate via targeted users and application vulnerabilities requires more than network layer segmentation to maintain an adequate security posture.
These advanced threats require application-level security controls such as application-level intrusion protection or advanced malware protection to protect chosen workloads. As a security platform, NSX based micro-segmentation goes beyond the recommendations noted in the NIST publication and enables fine-grained service insertion (e.g. allowing IPS services to be applied to flows between assets that are part of a PCI zone). In a traditional network environment, traffic steering is an all-or-nothing proposition, requiring all traffic to be steered through additional devices. With micro-segmentation, advanced services are granularly applied where they are most effective: as close to the application as possible, in a distributed manner, while residing in a separate trust zone outside the application’s attack surface.

Kubernetes and NSX


SECURING PHYSICAL WORKLOADS

While new workload provisioning is dominated by agile compute technologies such as virtualization and cloud, the security posture of physical workloads still has to be maintained. NSX has the security of physical workloads covered, as physical-to-virtual or virtual-to-physical communication can be enforced using distributed firewall rules at ingress or egress. In addition, for physical-to-physical communication, NSX can tie automated security of physical workloads into micro-segmentation through centralized policy control of those physical workloads via the NSX Edge Services Gateway or integration with physical firewall appliances. This allows centralized policy management of your static physical environment in addition to your micro-segmented virtualized environment.

CONCLUSION

NSX is the means to provide micro-segmentation through centralized policy controls, distributed stateful firewalling, overlay-based isolation, and service-chaining of partner services to address the security needs of the rapidly evolving information technology landscape. NSX easily meets and goes above and beyond the recommendations made by the National Institute of Standards and Technology for protecting virtualized workloads, secures physical workloads, and paves a path towards securing future workloads with a platform that meets your security needs today and is flexible enough to adapt to your needs tomorrow.

Use a Zero Trust Approach to Protect Against WannaCry

Micro-segmentation with VMware NSX compartmentalizes the data center to contain the lateral spread of ransomware attacks such as WannaCry

On May 12, 2017, reports began to appear of the WannaCry malware attacking organizations worldwide in one of the largest ransomware cyber incidents to date. The European Union Agency for Law Enforcement Cooperation (Europol) has reported more than 200,000 attacks in over 150 countries, with the full scope of the attack yet to be determined. Victims include organizations from all verticals.

WannaCry targets Microsoft Windows machines, seizing control of computer systems through a critical vulnerability in Windows SMB. It also utilizes RDP as an attack vector for propagation. It encrypts seized systems and demands a ransom be paid before decrypting the system and giving back control. The threat propagates laterally to other systems on the network via SMB or RDP and then repeats the process. An initial analysis of WannaCry by the US Computer Emergency Readiness Team (US-CERT) can be found here, with a detailed analysis from Malware Bytes here.

One foundational aspect of increasing cybersecurity hygiene in an organization to help mitigate such attacks from proliferating is enabling a least privilege (zero trust) model by embedding security directly into the data center network. The core concept of zero trust is to only allow for necessary communication between systems using a stateful firewall, assuming all network traffic is untrusted. This dramatically reduces the attack surface area.

VMware NSX micro-segmentation provides this intrinsic level of security to effectively compartmentalize the data center to contain the lateral spread of ransomware attacks such as WannaCry.

In this blog, the focus is on how NSX can help:
  • Contain the spread of the malware such as WannaCry
  • Provide visibility into on-going attacks
  • Identify systems that are still infected
  • Mitigate future risk through a micro-segmentation approach

Stages of the WannaCry cyber attack

Before we provide our attack mitigation recommendations, let us review the WannaCry ransomware attack lifecycle.

Weaponization:
WannaCry uses the EternalBlue exploit that was leaked from the NSA to exploit the MS17-010 vulnerability in Windows. WannaCry then encrypts data on the system including office files, emails, databases, and source code, as well as network shares, using RSA-2048 encryption keys with AES-128 encryption that are extremely difficult to break with current technology. WannaCry ends the “weaponization” stage by posting a message to the user demanding $300 in bitcoin as a ransom in order to decrypt the data.

Installation / Exploitation / Encryption / Command and Control:
WannaCry cycles through every open RDP session, since it is also a worm that carries the malware payload, drops itself onto systems, and spreads itself. As soon as the ransomware is dropped, it tries to connect to a command and control URL to seize control and encrypt the system. The code has both direct and proxy access to the internet. The next step for the worm is to install a service called “mssecsvc2.0” with display name “Microsoft Security Center (2.0) service”. The worm loads the crypto module when the service is installed and proceeds to encrypt the system.

Propagation:
WannaCry enters through email phishing or other means of breaching the network perimeter, scans all of the systems on the network, and spreads laterally from vulnerable system to system. Scans are not restricted to systems actively communicating; they also target IP addresses obtained via multicast traffic, unicast traffic, and DNS traffic. Once WannaCry obtains a list of IPs to target, it probes port 445 with a randomly generated spoofed source IP address. If the connection on port 445 of a vulnerable system is successful, WannaCry proceeds to infect and encrypt the system. Additionally, it scans the entire /24 subnet of the system (10 IP addresses at a time), probing for additional vulnerable systems.

Preventing the attack with VMware NSX

NSX can be used to implement micro-segmentation to compartmentalize the data center, containing the lateral spread of ransomware attacks such as WannaCry and achieving a zero trust network security model.


The following are recommendations in order of priority, to create a micro-segmented environment that can interrupt the WannaCry attack lifecycle.

  • Monitor traffic on port 445 with the NSX distributed firewall. This provides visibility into SMB traffic, which may include attack traffic or attempts. Whether the rule action is Allow or Block, once endpoint infection is determined, the NSX logs can be correlated or analyzed in a SIEM, log analyzer, or network behavior analyzer (a minimal log-filtering sketch follows this list).
  • Enable environmental redirection rules in NSX so that any traffic destined for critical systems is steered to an NSX-integrated IPS solution to detect network indicators of this attack. Even if the perimeter did not detect the malware, east-west traffic within the environment can be analyzed to detect the attack indicators.
  • Create an NSX Security Group for all VMs running the Windows operating system, to identify potentially vulnerable machines. This is simple to do in NSX, as you can group VMs based on attributes like operating system, regardless of their IP address.
  • Enable Endpoint Monitoring (an NSX 6.3+ feature) on VMs running Windows to detect mssecsvc2.0. If detected, verify and check which VMs it has started communicating with on port 445.
  • Create a distributed firewall rule to immediately block or monitor all traffic with a destination port of 445 on the /24 subnet of any VM found on that list.
  • Use Endpoint Monitoring to detect whether mssecsvc2.0 is running on systems that are not patched, so that NSX can detect if a new attack starts.
  • Additional precautions include blocking RDP communication between systems and blocking all desktop-to-desktop communication in VDI environments. With NSX, this level of enforcement can be achieved with a single rule.
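
As a companion to the first recommendation, the sketch below shows the kind of lightweight correlation that can be done on exported distributed firewall logs before they reach a full SIEM. It is a generic illustration: the log file path and the space-separated layout with a "dst-ip/445" style destination field are assumptions about how dfwpktlogs-style output might be exported, not a documented NSX format.

```python
# Minimal sketch: flag potential WannaCry/SMB activity by counting flows to
# destination port 445 in an exported distributed firewall log. The file path
# and the assumed "<src-ip> <dst-ip>/<dst-port>" field layout are illustrative,
# not a documented NSX log format.
from collections import Counter

LOG_FILE = "dfw-export.log"   # placeholder export of NSX DFW packet logs
suspect_sources = Counter()

with open(LOG_FILE) as handle:
    for line in handle:
        fields = line.split()
        # Assumed layout: ... <src-ip> <dst-ip>/<dst-port> ...
        dst_fields = [f for f in fields if f.endswith("/445")]
        if dst_fields:
            # Assume the source IP is the field immediately before the destination.
            src = fields[fields.index(dst_fields[0]) - 1]
            suspect_sources[src] += 1

# Hosts generating many SMB connection attempts are candidates for isolation.
for src, count in suspect_sources.most_common(10):
    print(f"{src} attempted {count} connections to port 445")
```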

Architecting a secure datacenter using NSX Micro-segmentation

With NSX micro-segmentation, organizations can enable a least privilege, zero trust model in their environment. For environments utilizing NSX, the distributed firewall applies security controls to every vNIC of every VM. This controls communications between all VMs in the environment (even if they are on the same subnet), unlike the traditional firewall model in which flows within a subnet are typically not restricted, allowing malware to spread laterally with ease.



With a zero trust architecture enabled by NSX, any non-approved flow is discarded by default, regardless of what services have been enabled on the VM, and ransomware like WannaCry cannot propagate – immediately blunting the damage to data center operations and hence to the organization.

More Information:

http://static-void.io/nsx-over-aci-explained/

https://www.isaca.org/Journal/archives/2014/Volume-6/Pages/How-Zero-trust-Network-Security-Can-Enable-Recovery-From-Cyberattacks.aspx


https://blogs.vmware.com/networkvirtualization/2016/06/micro-segmentation-defined-nsx-securing-anywhere.html/


https://blogs.vmware.com/networkvirtualization/2016/06/3479.html/


https://blogs.vmware.com/networkvirtualization/2016/07/operationalizing-micro-segmentation-nsx-securing-anywhere-part-3.html/


https://blogs.vmware.com/networkvirtualization/2016/07/micro-segmentation-defined-nsx-securing-anywhere-part-iv.html/


https://blogs.vmware.com/networkvirtualization/2015/12/time-to-build-a-zero-trust-network.html/


https://www.vmware.com/solutions/industry/education/nsxk12.html


https://blogs.vmware.com/networkvirtualization/2017/05/use-zero-trust-protects-against-wannacry.html/


https://www.eventbrite.ca/e/implementing-zero-trust-with-vmware-nsx-and-vrealize-network-insight-tickets-31422460425


https://blogs.vmware.com/apps/2012/02/virtualizing-oracle-11gr2-rac-databases-on-vsphere-5.html


https://blogs.vmware.com/networkvirtualization/service-insertion-pic4/



VMware NSX vSphere Zero-Trust Security Demo





Cloudify 3.4 Brings Open Source Orchestration to Top Five Public Clouds with New Azure Support and Full Support for VMware, OpenStack Private Clouds



Cloudify 3.4 Brings Open Source Orchestration to Top Five Public Clouds

The latest version of Cloudify open source multi-cloud orchestration software—Cloudify 3.4—is now available. It brings pure-play cloud orchestration to every major public and private cloud platform—Amazon Web Services (AWS), Azure, Google Cloud Platform (GCP), OpenStack and VMware—as well as cloud native technologies like Kubernetes and Docker. The software is at work across multiple industries and geographies, and it has become a preferred orchestrator for telecom providers deploying network functions virtualization (NFV).

***Cloudify 3.4 is now available for download here.***

Cloudify is the only pure-play, standards-based (TOSCA) cloud orchestration platform that supports every major private and public cloud infrastructure offering. With Cloudify, enterprises can use a single, open source cloud orchestration platform across OpenStack, VMware or AWS clouds, with virtualization approaches such as VMs or containers and with different automation toolsets like Puppet, Chef or Saltstack. Because it provides an easy-to-use, open source tool for management and orchestration (MANO) of multiple clouds, data centers and availability zones, Cloudify is attractive to telecoms, internet service providers, and enterprises using hybrid cloud.



Key Feature Summary

Enterprise-Grade Enhanced Hybrid Cloud Support - supports all major public and private cloud environments, including AWS, Azure, GCP, OpenStack and VMware vSphere and vCloud
Support for Entire VMware Stack - the only open source orchestration platform supporting the entire VMware stack; all VMware plugins are open source and available in the Cloudify Community edition
Public Shared Images for both AWS and OpenStack - prebaked Cloudify Manager environments now available for AWS through a shared AMI, and OpenStack through a QCOW image; enables simple bootstrapping of a full-fledged Cloudify environment in minutes
Deployment Update - allows updating of application deployments, enabling application operations engineers and developers to introduce topology changes and include new resources to run TOSCA deployments
In-Place Manager Upgrade - the new Cloudify Manager upgrade process provides fully automated in-place upgrades for all manager infrastructure without any downtime to the managed services; in-place upgrade will allow easy migration between Cloudify versions and application of patched versions

Cloudify 3.4 Enhanced for Hybrid Cloud, Microservices

OpenStack Ottawa Meetup - March 29th 2017




The new release enhances Cloudify usability among enterprises looking for hybrid cloud orchestration without compromising on solutions that cater to the least common denominator of API abstraction. It does this by offering greater support of IaaS, enhanced usability, quicker installation and improved maintenance processes. Cloudify 3.4 introduces plugins for Microsoft Azure and GCP, complementing the existing portfolio of plugins for OpenStack, AWS and VMware vSphere and vCloud, which are now all open source. The new release also enhances support for container orchestration and container lifecycle management, including microservices modeling and enhanced support for Kubernetes.

The New Hybrid Stack with New Kubernetes Support

Cloudify 3.4 adds support for the Kubernetes container management project, enabling users to manage hybrid stacks that include both microservices on top of Kubernetes, alongside stateful services such as backends on bare-metal and VMs. It also manages composition and dependency management between services, as well as triggering of auto-scaling of both the micro-services and Kubernetes minions.

Continuous Deployment Across Clouds

Managing applications across hybrid environments and stacks goes far beyond infrastructure-layer orchestration. DevOps processes such as continuous deployment across clouds can be difficult to apply in hybrid cloud environments. Cloudify 3.4 comes with a new set of features that enables pushing updates to both the application and the infrastructure itself.

OpenStack Benefits for VMware



Cloudify for Telecom Operators

Cloudify 3.4 continues the open disruption in telecom and strengthens even further the offering for telecom service providers with its “Cloudify for NFV MANO (Management and Orchestration)” offering, which includes a robust set of new features, NFV-specific plugins, and blueprints showcasing modeling of VNFs (Virtual Network Functions) and SFC (Service Function Chaining) using TOSCA.

Media Resources

Cloudify 3.4 Has Landed - Learn More
Hybrid Cloud Blog Posts
Online Kubernetes Lab and Hybrid Cloud Module
Hybrid Cloud in Production Webinar
New Cloudify Telco Edition

About GigaSpaces

GigaSpaces Technologies provides software for cloud application orchestration and scaling of mission-critical applications on cloud environments. Hundreds of tier-one organizations worldwide are leveraging GigaSpaces technology to enhance IT efficiency and performance, including top financial firms, e-commerce companies, online gaming providers, healthcare organizations and telecom carriers. GigaSpaces has offices in the US, Europe and Asia. More at www.gigaspaces.com and getcloudify.org

Microsoft introduces Azure Stack, its answer to OpenStack

Microsoft has taken the wraps off Azure Stack, its take on hybrid cloud infrastructure and response to the popular OpenStack open-source cloud computing package. Azure Stack will begin shipping in September.



Azure Stack was originally designed as a software-only product, much like OpenStack. But Microsoft has decided to add integrated hardware turnkey solutions from its certified partners such as Dell EMC, HPE, Lenovo, Cisco and Huawei.

Microsoft first announced Azure Stack at the Ignite Conference in 2015 and formally introduced it at the Inspire conference in Washington, D.C.

Azure Stack delivers essentially the same APIs, tools, and processes that power Azure, but it’s intended to be hosted on-premises in private cloud scenarios. By offering the same platform and tools both on-premises and in Azure, the company promises consistency and ease of deployment, whether it’s hosted locally or in the cloud.
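
To make the "same APIs and tools" point concrete, the sketch below uses the Python Azure SDK of that era (azure-common credentials plus azure-mgmt-resource) and simply points the client's base_url at an Azure Stack Resource Manager endpoint instead of public Azure. The endpoint, tenant, subscription, region name, and credentials are placeholders, and the SDK packages and API versions supported on Azure Stack should be confirmed against Microsoft's compatibility documentation.

```python
# Minimal sketch: the same Azure SDK code targets public Azure or Azure Stack
# simply by pointing `base_url` at the relevant Resource Manager endpoint.
# Endpoint, tenant, subscription, credentials, and region name are placeholders.
from azure.common.credentials import UserPassCredentials
from azure.mgmt.resource import ResourceManagementClient

SUBSCRIPTION_ID = "<subscription-id>"                           # placeholder
ARM_ENDPOINT = "https://management.local.azurestack.external"   # placeholder Azure Stack ARM endpoint
# For public Azure, the default endpoint https://management.azure.com is used instead.

credentials = UserPassCredentials("<user>@<tenant>", "<password>")  # placeholder
client = ResourceManagementClient(credentials, SUBSCRIPTION_ID, base_url=ARM_ENDPOINT)

# Create (or update) a resource group -- identical code on Azure and Azure Stack.
rg = client.resource_groups.create_or_update("demo-rg", {"location": "local"})  # placeholder region
print("provisioned resource group:", rg.name)
```

The design point is that the application code does not change between clouds; only the endpoint, credentials, and region name do.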

It also makes it possible to deploy different instances of the same app to meet regulatory compliance requirements – for example, a financial app with different business or technical requirements, or regulatory limits on what can go into the cloud. Both instances can share the same codebase, with one slightly altered for the cloud.

The Cloud On Your Terms - Azure PaaS Overview


“The ability to run consistent Azure services on-premises gets you full flexibility to decide where applications and workloads should reside,” said Mike Neil, corporate vice president for Azure Infrastructure and Management, in the blog post accompanying the announcement.

Azure Stack will use two pricing models: pay-as-you-use, similar to what you would get with the Azure service, and capacity-based, where customers will pay a fixed annual fee based on the number of physical cores in a system.

Omni v2.0 GCE, Azure integration and support for multiple regions


There will also be an option of having Azure Stack delivered and operated as a fully managed service. The services will be managed by data center operators such as Avanade, Daisy, Evry, Rackspace and Tieto. These companies are already delivering services around Azure.

Microsoft has said that its goal is to ensure that most ISV applications and services that are certified for Azure will work on Azure Stack. ISVs such as Bitnami, Docker, Kemp Technologies, Pivotal Cloud Foundry, Red Hat Enterprise Linux and SUSE Linux are working to make their solutions available on Azure Stack.

Microsoft also announced the Azure Stack Development Kit (ASDK), a free single-server deployment SDK for building and validating applications on the Azure Stack.





Throughout the Technical Previews, we’ve seen tremendous customer and partner excitement around Microsoft Azure Stack. In fact, we’re speaking with thousands of partners this week at our Microsoft Inspire event. Our partners are excited about the new business opportunities opened up by our ‘One Azure Ecosystem’ approach, which helps them extend their Azure investments to Azure Stack, to unlock new possibilities for hybrid cloud environments. In that vein, today we are announcing:

Orderable Azure Stack integrated systems: We have delivered Azure Stack software to our hardware partners, enabling us to begin the certification process for their integrated systems, with the first systems to begin shipping in September. You can now order integrated systems from Dell EMC, HPE, and Lenovo.
Azure Stack software pricing and availability: We have released pricing for the pay-as-you-use and capacity-based models today; you can use that information to plan your purchases.
Azure Stack Development Kit (ASDK) availability: ASDK, the free single-server deployment option for trial purposes, is available for web download today. You can use it to build and validate your applications for integrated systems deployments.

Azure Stack promise

Azure Stack is an extension of Azure, thereby enabling a truly consistent hybrid cloud platform. Consistency removes hybrid cloud complexity, which helps you maximize your investments across cloud and on-premises environments. Consistency enables you to build and deploy applications using the exact same approach – same APIs, same DevOps tools, same portal – leading to increased developer productivity. Consistency enables you to develop cloud applications faster by building on Azure Marketplace application components. Consistency enables you to confidently invest in people and processes knowing that those are fully transferable. The ability to run consistent Azure services on-premises gets you full flexibility to decide where applications and workloads should reside. An integrated systems-based delivery model ensures that you can focus on what matters to your business (i.e., your applications), while also enabling us to deliver Azure innovation to you faster.

In its initial release, Azure Stack includes a core set of Azure services, DevOps tooling, and Azure Marketplace content, all of which are delivered through an integrated systems approach. Check out this whitepaper for more information about what capabilities are available in Azure Stack at the initial release and what is planned for future versions.

Hybrid use cases unlock application innovation

Azure and Azure Stack unlock new use cases for customer facing and internal line of business applications:

Edge and disconnected solutions: You can address latency and connectivity requirements by processing data locally in Azure Stack and then aggregating in Azure for further analytics, with common application logic across both. We’re seeing lots of interest in this Edge scenario across different contexts, including factory floor, cruise ships, and mine shafts.
Cloud applications that meet varied regulations: You can develop and deploy applications in Azure, with full flexibility to deploy on-premises on Azure Stack to meet regulatory or policy requirements, with no code changes needed. Many customers are looking to deploy different instances of the same application – for example, a global audit or financial reporting app – to Azure or Azure Stack, based on business and technical requirements. While Azure meets most requirements, Azure Stack enables on-premises deployments in locations where it’s needed. Saxo Bank is a great example of an organization that plans to leverage the deployment flexibility enabled by Azure Stack.
Cloud application model on-premises: You can use Azure web and mobile services, containers, serverless, and microservice architectures to update and extend existing applications or build new ones. You can use consistent DevOps processes across Azure in the cloud and Azure Stack on-premises. We’re seeing broad interest in application modernization, including for core mission-critical applications. Mitsui Knowledge Industry is a great example of an organization planning their application modernization roadmap using Azure Stack and Azure.
Ecosystem solutions across Azure and Azure Stack



You can speed up your Azure Stack initiatives by leveraging the rich Azure ecosystem:

Our goal is to ensure that most ISV applications and services that are certified for Azure will work on Azure Stack. Multiple ISVs, including Bitnami, Docker, Kemp Technologies, Pivotal Cloud Foundry, Red Hat Enterprise Linux, and SUSE Linux, are working to make their solutions available on Azure Stack.
You have the option of having Azure Stack delivered and operated as a fully managed service. Multiple partners, including Avanade, Daisy, Evry, Rackspace, and Tieto, are working to deliver managed service offerings across Azure and Azure Stack. These partners have been delivering managed services for Azure via the Cloud Solution Provider (CSP) program and are now extending their offerings to include hybrid solutions.
Systems Integrators (SI) can help you accelerate your application modernization initiatives by bringing in-depth Azure skillsets, domain and industry knowledge, and process expertise (e.g., DevOps). PriceWaterhouseCoopers (PwC) is a great example of an SI that’s expanding their consulting practice to Azure and Azure Stack.
Orderable integrated systems, free single-server kit for trial
Azure Stack has two deployment options:

Azure Stack integrated systems – These are multi-server systems meant for production use, and are designed to get you up and running quickly. Depending upon your hardware preferences, you can choose integrated systems from Dell EMC, HPE, and Lenovo (with Cisco and Huawei following later). You can now explore these certified hardware solutions and order integrated systems by contacting our hardware partners. These systems come ready to run and offer consistent, end-to-end customer support no matter who you call. They will initially be available in 46 countries covering key markets across the world.
Azure Stack Development Kit (ASDK) – ASDK is a free single server deployment that’s designed for trial and proof of concept purposes. ASDK is available for web download today, and you can use it to prototype your applications. The portal, Azure services, DevOps tools, and Marketplace content are the same across this ASDK release and integrated systems, so applications built against the ASDK will work when deployed to a multi-server system.
Closing thoughts
As an extension of Azure, Azure Stack will deliver continuous innovation with frequent updates following the initial release. These updates will help us deliver enriched hybrid application use cases, as well as grow the infrastructure footprint of Azure Stack. We will also continue to broaden the Azure ecosystem to enable additional choice and flexibility for you.


Cloud Orchestration with Azure and OpenStack – The Less Explored Hybrid Cloud




Oftentimes, when hybrid cloud is discussed, the natural pairings center around OpenStack coupled with VMware, or AWS coupled with OpenStack, and even diverse clouds and container options – but Azure coupled with OpenStack is a much less common discussion.

OpenStack Summit Vancouver 2018



This is actually quite an anomaly when you think about it, as both Azure’s public cloud and OpenStack’s private cloud are highly enterprise-targeted. With Azure boasting enterprise-grade security and encryption – and even offering the newly announced Azure Stack, aimed at helping enterprises bridge the gap between their data centers and the cloud – and with OpenStack’s inherent openness of APIs enabling enterprises to build their own cloud, the two should naturally fit together in the cloud landscape. Yet this combination is surprisingly often overlooked.

Free, open source, hybrid cloud orchestration – need I say more?  Get Cloudify



Nati Shalom recently discussed, in his post Achieving Hybrid Cloud Without Compromising On The Least Common Denominator, a survey showing that enterprises are often leveraging as many as six clouds simultaneously, and the list just keeps on growing with new technologies sprouting up by the minute.

That’s why solutions like Azure Stack – which are also geared towards multi-cloud scenarios in the context of app migration from traditional data centers to the cloud, while taking into account all of the enterprise-grade considerations involved in such a transition – are critical.

Project Henosis Unified management of VMs and Container based infrastructure for OpenStack


Historically, in order to achieve cloud portability you would need to cater to the least common denominator by abstracting your application from all of the underlying logic of the infrastructure below, but this type of model comes at a costly price: you lose all of the actual advantages the specific cloud provides. What if there were a better approach – a way to achieve interoperability and extensibility between clouds, all while taking full advantage of the underlying cloud’s capabilities and service portfolio?

Highlights of OpenStack Mitaka and the OpenStack Summit




But even so, many solutions don’t provide the extensibility and interoperability enterprises need for future-proofing, application deployment portability, and other popular use cases across clouds. Hybrid cloud itself has also proven that it isn’t immune to future-proofing concerns, with disruptive technologies arising every day – not unlike the latest and greatest containers (read more on The Disruption Cycle). This means that the new approach needs to be built for hybrid stacks, not just clouds, all while providing the full functionality the underlying infrastructure provides.



Enter TOSCA (the OASIS standard for cloud applications). TOSCA was written for this exact scenario, and provides inherent cloud interoperability and agnosticism. The TOSCA approach is intended to standardize the way applications are orchestrated in cloud environments. And enterprises love standards. Building one syntax and vocabulary enables organizations to adapt to the fast-paced world of cloud in a substantially simplified manner.

Security for Cloud Computing: 10 Steps to Ensure Success V3.0




Cloudify, based on TOSCA and built from the ground up as an integration platform, leverages standardized templating, workflows, and cloud plugins to provide a single pane of glass across technologies that wouldn’t natively or intuitively plug into each other – such as OpenStack and Azure, and even Kubernetes or Docker, and non-virtualized environments like traditional data centers. Cloudify makes it possible to choose a technology that adapts to the way your organization works, or would like to work, rather than requiring you to adapt your technologies, stacks, or practices to the technologies you adopt.

Templating languages such as TOSCA enable far greater flexibility for abstraction than API abstraction, providing the level of extensibility and customization that enterprises require without the need to develop or change the underlying implementation code. This is why major projects such as ARIA, Tacker, and OpenStack Heat are building solutions based on this standard.
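
As a sketch of what orchestrating a TOSCA blueprint looks like in practice, the snippet below drives a Cloudify Manager with the cloudify-rest-client Python package: upload a blueprint archive, create a deployment with inputs, and run the install workflow. The manager address, blueprint archive, and inputs are placeholders, and the call signatures should be checked against the client documentation for your Cloudify version.

```python
# Minimal sketch: driving a TOSCA blueprint through Cloudify Manager with the
# cloudify-rest-client package. Manager address, blueprint archive, and
# deployment inputs are placeholders; verify call signatures against the
# client documentation for your Cloudify version.
from cloudify_rest_client import CloudifyClient

client = CloudifyClient(host="cloudify-manager.example.local")  # placeholder

# Upload a TOSCA blueprint archive (e.g. an Azure or OpenStack topology).
client.blueprints.publish_archive(
    archive_location="my-app-blueprint.tar.gz",   # placeholder path
    blueprint_id="my-app",
)

# Create a deployment from the blueprint with environment-specific inputs.
client.deployments.create(
    blueprint_id="my-app",
    deployment_id="my-app-prod",
    inputs={"image_id": "<image>", "flavor": "<flavor>"},  # placeholders
)

# Run the install workflow to provision the topology on the target cloud.
execution = client.executions.start(deployment_id="my-app-prod", workflow_id="install")
print("started execution:", execution.id)
```

The same blueprint can target a different cloud by swapping the plugin and inputs, which is the portability argument made above.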



In this way, Azure users now have a set of building blocks for managing the entire application stack and its lifecycle, across clouds, stacks and technologies. And with Microsoft now proudly having the most open source developers on GitHub, yup – ahead of Facebook, Angular, and even Google & Docker amazingly – Azure is uniquely positioned to achieve this level of openness and interoperability.

Montreal Linux MeetUp - OpenStack Overview (2017.10.03)




This will also ultimately provide a higher degree of flexibility that allows users to define their own level of abstraction per use case or application.  In this manner, cloud portability is achievable without the need to change the underlying code, enabling true hybrid cloud.

China's largest OpenStack Cloud accelerates the science discovery of AMS-02


More Information:

https://cloudify.co/2018/01/15/cloudify-kubernetes-plugin-2-0-more-k8s-awesomeness/

https://cloudify.co/product/

https://wp.cloudify.co/tag/cloud-orchestration/

https://wp.cloudify.co/2017/08/08/introducing-cloudify-kubernetes-plugin-orchestrating-deployment-applications-k8s-clusters-multi-cloud-environment/

https://www.businesswire.com/news/home/20180201005855/en/Cloudify-Selected-NTT-DATA-INTELLILINK-Corporation-Preferred


http://docs.getcloudify.org/4.2.0/plugins/container-support/

https://wp.cloudify.co/authors/shay-naeh/

https://cloudify.co/blog/

http://getcloudify.org.s3-website-us-east-1.amazonaws.com/cloudify-3.4-enterprise-hybrid-cloud-kubernetes-cloud-orchestration-amazon-openstack-gcp-azure.html

https://www.gigaspaces.com/cloudify-34-brings-open-source-orchestration-top-five-public-clouds-new-azure-support-and-full-sup-0

Announcing General Availability of Red Hat CloudForms 4.6




CloudForms and Ansible Integration




Red Hat CloudForms 4.6 is now generally available, as announced in the recent press release. One of the key highlights of the release is the introduction of Lenovo XClarity as the first physical infrastructure provider, enabling CloudForms to go beyond hybrid cloud management and manage hybrid infrastructure.

CloudForms 4.6 continues to build on the automation-centric approach to multi-cloud management that was introduced in 4.5, aligning with Red Hat’s vision to simplify IT management with Ansible’s powerful automation capabilities.

Additional enhancements focus on provider capabilities and usability. Let’s take a closer look at what’s new in CloudForms 4.6, and be sure to check back in on this blog for more detailed posts on many of these new capabilities in the coming weeks.

Red Hat Management Demos




New Lenovo XClarity Provider: enables CloudForms to discover and manage Lenovo physical compute infrastructure alongside virtual and multi-cloud through a single pane of glass.

Ansible Automation Inside:  
  • Call Ansible playbooks as methods in state machines, allowing for hybrid Ruby and Ansible orchestration.
  • Compute resource linking in services, providing visibility of Ansible deployed compute items.
  • Provide a foundational layer to curate Ansible modules, adding secure authentication for Ansible callbacks to CloudForms.
  • Support additional Ansible credentials, including OpenStack, Azure, Google, Satellite, Subversion, GitLab, as well as Ansible Networking.


Additional provider enhancements:  Red Hat OpenShift Container Platform, Red Hat OpenStack, Red Hat Virtualization

Usability enhancements for the Administrative User Interface:
  • Dynamic Resource Objects to quickly add the capability to provision and collect data on resources not supported by Red Hat CloudForms
  • Prometheus Alert Management
  • New service editor for easier service design
  • Create custom buttons in the Administrative Interface for frequent actions


Operations User Interface:
  • Enhanced snapshot management with more views for increased visibility
  • Improved user experience for resource details
  • Enhanced service dialog with validation of dialog fields as you type and more tool tips
  • Create custom buttons in the Operations User Interface for frequent actions
  • Additional Operations User Interface customization options to meet customer requirements for branding and access control

Red Hat CloudForms 4.6

Red Hat CloudForms 4.6 builds on the automation-centric foundation to multi-cloud management introduced in CloudForms 4.5, including increased support for automated infrastructure provisioning and scaling of Red Hat OpenShift Container Platform and Red Hat Openstack Platform deployments. CloudForms 4.6 is designed to make more Ansible capabilities available natively within CloudForms, including the ability for CloudForms to execute Ansible playbooks and visibility and linking into Ansible-deployed compute resources.

Integrate Openshift with Cloudforms



Red Hat CloudForms 4.6 also introduces Lenovo XClarity as the first physical infrastructure provider, enabling CloudForms to go beyond hybrid cloud management and manage hybrid infrastructure. The new Lenovo XClarity provider enables CloudForms to discover and manage physical compute infrastructure alongside virtual and multi-cloud through a single pane of glass. This view helps deliver valuable insight to system administrators to determine on-premise capacity and analyze the impacts of infrastructure modifications on workload and control infrastructure maintenance.

This video demonstrates how you can take manual tasks and processes and turn them into automation workflows. In this video we utilize Red Hat CloudForms and Ansible Tower to provide an underlying automation and orchestration framework to deliver automation to your IT organization.

Containers, OpenShift, Kubernetes all with Red Hat CloudForms



The demonstration shows how a user can order a service and have automation provision and deliver the resources while tracking the elements in a ticketing system (ServiceNow).

At a high level, the following areas are demonstrated:
  • Ordering an instance inside CloudForms self-service portal
  • CloudForms auto approval and quota escalation features
  • Ansible Tower’s powerful and intuitive workflows
  • Integration into third party web services (ServiceNow and Microsoft Azure)





This technical presentation details the integration points and technical value of all four Red Hat® Cloud Infrastructure components: Red Hat Enterprise Linux® OpenStack® Platform, Red Hat Enterprise Virtualization, Red Hat CloudForms, and Red Hat Satellite. This session will also illustrate several different deployment scenarios that this flexible offering allows. In addition, you'll learn about common integration… Full session details: https://www.redhat.com/en/technologies/cloud-computing/cloud-infrastructure, http://itinfrastructure.report/view-resource.aspx?id=958, and https://www.openstack.org/videos/


The definitive OpenStack Map

Presenting the OpenStack map, the process that went into its creation, and the next steps.


Automating CloudForms Appliance Deployment with Ansible

Red Hat CloudForms ships as an appliance to simplify deployment as much as possible – a Red Hat Enterprise Linux server with the appropriate software loaded, ready to be configured with a few basic configuration options.

Traditionally, these servers are configured using the command line tool appliance_console. This is a simple, menu-based interface that allows you to configure the core functionality of the appliance and makes it exceptionally easy to do so. Unfortunately, menu-based interfaces don’t lend themselves to being automated easily.

However, there is a solution!

Openstack Cloud Management and Automation Using Red Hat Cloudforms 4.0



All CloudForms appliances ship with another tool called appliance_console_cli. We can combine this tool with an Ansible playbook to automate the configuration of our appliance(s).

Before we go further, take a look at the sample playbook located on Github. This playbook shows a simple scenario that configures two appliances:

  • A primary database appliance, in which we use a separate disk and configure an internal VMDB
  • A non-VMDB appliance, which joins the region in the primary database

The playbook sets some standard configuration for all the appliances – namely a common root password and an appropriate hostname – then uses the appliance_console_cli tool through the Ansible shell module.

Let’s take a look at some of the key options available to appliance_console_cli, as of CloudForms 4.5. This isn’t an exhaustive list, so have a look at the help output of the command to see them all:

Server configuration options

--host: set the hostname for the appliance. Also updates your /etc/hosts – handy!
--ipaserver, --ipaprincipal, --ipapassword, --ipadomain and --uninstall: establish this host in an IPA realm, using the principal and password you provide. Note the principal must have the privileges needed to register the host and register a service.
--logdisk, --tmpdisk: specify the devices used for the log and tmp directories.

Database options

--region: the region for the appliance; needed when establishing a database
--internal: specify this if you want to create an internal database (i.e. you’re not connecting to a remote postgresql db)
--hostname, --port, --username, --password, --dbname: key details for your database. Without the --internal parameter, these are used to join your appliance to an external database.
--dbdisk: specify a device to use for the postgresql data directory. Very handy!

Preparing the appliance

--fetch-key, --sshlogin, --sshpassword: fetch the v2_key encryption key from a remote appliance with the provided SSH login credentials. All appliances connected to a VMDB need the same v2_key!
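
To show how the flags above fit together, here is a minimal Python sketch that assembles the appliance_console_cli commands for a primary appliance with an internal VMDB and a second appliance that fetches the v2_key and joins the external database. Hostnames, passwords, and disk devices are placeholders, the exact argument forms (for example, what --fetch-key expects) should be confirmed against the command's help output, and the sample playbook referenced earlier runs equivalent commands through Ansible's shell module.

```python
# Minimal sketch: assembling appliance_console_cli invocations for a primary
# (internal VMDB) appliance and a secondary appliance joining its region.
# Hostnames, passwords, and disk devices are placeholders; confirm argument
# forms against `appliance_console_cli --help` on your appliance.
DB_PASSWORD = "<vmdb-password>"        # placeholder
PRIMARY_HOST = "cf-db01.example.com"   # placeholder

primary_cmd = [
    "appliance_console_cli",
    "--host", PRIMARY_HOST,            # sets hostname and updates /etc/hosts
    "--region", "1",                   # region id for the new VMDB
    "--internal",                      # create an internal database locally
    "--username", "root",
    "--password", DB_PASSWORD,
    "--dbdisk", "/dev/vdb",            # dedicated device for the postgresql data dir
]

secondary_cmd = [
    "appliance_console_cli",
    "--host", "cf-app01.example.com",  # placeholder
    "--fetch-key", PRIMARY_HOST,       # pull the shared v2_key from the primary (assumed argument form)
    "--sshlogin", "root",
    "--sshpassword", "<ssh-password>", # placeholder
    "--hostname", PRIMARY_HOST,        # external VMDB to join (no --internal)
    "--username", "root",
    "--password", DB_PASSWORD,
]

# Print the commands; in the sample playbook each is run on its own appliance
# via the Ansible shell module.
for cmd in (primary_cmd, secondary_cmd):
    print(" ".join(cmd))
```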

CloudForms 4.6 extends the commands of appliance_console_cli and brings it closer to feature parity with appliance_console. A major improvement is the ability to configure database replication on the command line, just by running different parameters on your primary and standby nodes. Super useful! This will be the focus of a future article, and I’ll extend the playbook to deploy two VMDB appliances in a primary/standby configuration.

What are you waiting for? Head to the Red Hat Customer Portal and try out the CloudForms 4.6 Beta! General Availability is just around the corner…

Ansible Automation

Don’t forget, the upcoming release of CloudForms 4.6 brings improved embedded Ansible Automation Inside capabilities. If you are not familiar with it, Embedded Ansible has been a feature of CloudForms since version 4.5 and allows you to store and execute Ansible playbooks from within CloudForms.

For example, Ansible Automation allows you to execute a playbook as part of a Service Catalog request to configure provisioned VMs for the requester. Alternatively, a playbook can be executed when a user interface button is pressed, or in response to an event or alert.

Automating the Enterprise with CloudForms & Ansible



Ansible Modules and CloudForms

Ansible 2.4 provides Ansible modules to manage CloudForms: manageiq_provider and manageiq_user. These modules use the CloudForms REST API to automate the configuration of providers and users.

Combining these configuration modules with the playbook above allows you to provision and configure CloudForms appliances, define users in the VMDB, and configure new providers – all in a single play!
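
For anyone who prefers to call the same REST API directly rather than through the manageiq_provider and manageiq_user modules, here is a minimal, read-only Python sketch. The appliance URL, credentials, and TLS settings are placeholder assumptions for a lab appliance; the available collections depend on your CloudForms version.

```python
# Minimal sketch: query the CloudForms/ManageIQ REST API that the manageiq_* modules use.
import requests

APPLIANCE = "https://cfme.example.com"   # hypothetical appliance address
AUTH = ("admin", "smartvm")              # placeholder credentials
VERIFY = False                           # lab appliance with a self-signed certificate

# The API entry point lists the collections (providers, users, vms, ...) that are available.
entry = requests.get(f"{APPLIANCE}/api", auth=AUTH, verify=VERIFY).json()
print([c["name"] for c in entry.get("collections", [])][:10])

# List the providers currently registered -- roughly what manageiq_provider manages.
providers = requests.get(f"{APPLIANCE}/api/providers?expand=resources",
                         auth=AUTH, verify=VERIFY).json()
for p in providers.get("resources", []):
    print(p.get("name"), p.get("type"))
```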

Conclusion

Ansible is being embedded throughout Red Hat's entire cloud software portfolio, and CloudForms is no exception. Keep an eye out for future posts in this series, where we will test drive some of the new features of appliance_console_cli in the upcoming 4.6 release.

More Information:

https://www.redhat.com/en/about/press-releases/red-hat-extends-cloudforms-multi-cloud-management-ansible-automation


https://redhatstackblog.redhat.com/2015/05/13/public-vs-private-amazon-compared-to-openstack/


https://www.redhat-cloudstrategy.com


https://www.redhat-cloudstrategy.com/how-to-manage-the-cloud-journey/


https://cloudformsblog.redhat.com/2018/02/23/cloudforms-database-high-availability-explained/#more-2336


https://cloudformsblog.redhat.com/2017/12/19/red-hat-cloudforms-2017-in-review/


https://cloudformsblog.redhat.com/2017/12/07/debugging-ansible-automation-inside-red-hat-cloudforms/


https://cloudformsblog.redhat.com/2018/01/24/automating-cloudforms-appliance-deployment-with-ansible/


https://cloudformsblog.redhat.com/2017/05/31/ansible-automation-inside-cloudforms/


https://cloudformsblog.redhat.com/2018/02/13/automating-instance-provisioning-with-cloudforms-and-ansible-tower-video/#more-2329

Microsoft Azure Databricks



Azure Databricks is an Apache Spark-based analytics platform optimized for the Microsoft Azure cloud services platform.





A fast, easy, and collaborative Apache Spark™ based analytics platform optimized for Azure


Designed in collaboration with Microsoft, Azure Databricks combines the best of Databricks and Azure to help customers accelerate innovation with one-click set up, streamlined workflows and an interactive workspace that enables collaboration between data scientists, data engineers, and business analysts.



INCREASE PRODUCTIVITY AND COLLABORATION
Bring teams together in an interactive workspace. From data gathering to model creation, use Databricks notebooks to unify the process and instantly deploy to production. Launch your new Spark environment with a single click. Integrate effortlessly with a wide variety of data stores and services such as Azure SQL Data Warehouse, Azure Cosmos DB, Azure Data Lake Store, Azure Blob storage, and Azure Event Hub. Add artificial intelligence (AI) capabilities instantly and share insights through rich integration with Power BI.


BUILD ON A SECURE, TRUSTED CLOUD
Protect your data and business with Azure Active Directory integration, role-based controls, and enterprise-grade SLAs. Get peace of mind with fine-grained user permissions, enabling secure access to Databricks notebooks, clusters, jobs and data.


SCALE WITHOUT LIMITS
Globally scale your analytics and data science projects. Build and innovate faster using machine learning capabilities. Add capacity instantly. Reduce cost and complexity with a fully-managed, cloud-native platform. Target any size data or project using a complete set of analytics technologies including SQL, Streaming, MLlib, and Graph.

Introduction to Azure Databricks



Big-data company Databricks Inc. made its flagship analytics platform available as an integrated service within Microsoft Corp.’s Azure public cloud.

The service, called Microsoft Azure Databricks, is designed to help customers better process massive amounts of data stored in Microsoft’s cloud, the companies said.

Databricks has grown to become one of the most recognized players on the big-data scene. The company was formed by the creators of the Spark research project at the University of California at Berkeley, which later became the popular open-source big data processing framework called Apache Spark. Databricks was founded to commercialize that software through its Unified Analytics Platform, an analytics service based on Spark that’s increasingly being used to power modern workloads such as artificial intelligence.

In a blog post, Microsoft Vice President of Azure Data Rohan Kumar and Databricks Chief Executive Officer Ali Ghodsi revealed that Azure Databricks was the fruit of more than two years of collaboration. The executives said the companies began working on the service in response to customer requests for a version of Databricks that’s compatible with Azure. The service, introduced in beta last November, is now being made generally available.



“We experienced a lot of interest and engagement in the preview from organizations in need of a high-performance analytics platform based on Spark,” Kumar said. “With Azure Databricks, deeply integrated with services like Azure SQL Data Warehouse, our customers are now positioned to increase productivity and collaboration and globally scale analytics and data science projects on a trusted, secure cloud environment.”

Azure Databricks has been designed to help make things easier for customers. Rather than doing all the heavy lifting that comes with deploying Databricks in their own data centers, customers can simply access the service via the Azure cloud. Azure Databricks also provides greater compatibility with Microsoft’s own services.

With Azure Databricks it becomes possible to take data from other services and prepare it and process it using machine learning algorithms. From there, the data can also be streamed to other services such as CosmosDB and PowerBI, the executives said.

Azure Databricks was chiefly designed to fulfill companies’ interest in using data to power their artificial intelligence systems. To that end, the service was built with three design principles in mind. The first is enhancing user productivity in developing Big Data applications and analytics pipelines. The second principle was to build a system that could scale almost infinitely without skyrocketing costs. Third, the companies had to ensure that the new service met strict security and compliance standards for enterprises.



“Azure Databricks protects customer data with enterprise-grade SLAs, simplified security and identity, and role-based access controls with Azure Active Directory integration,” the executives said. “As a result, organizations can safeguard their data without compromising productivity of their users.”

“This speaks to the increasing power of cloud services,” said Rob Enderle, principal analyst at the Enderle Group. “Databricks is analytics at scale and this effort should put the analysis engine far closer to the massive amounts of data already being placed on Azure. The result should be a combination of higher performance and lower cost for analytics at massive scale.”





Databricks, provider of the leading Unified Analytics Platform and founded by the team who created Apache Spark™, will showcase its Unified Analytics Platform as a Silver sponsor (booth #1111) at the Gartner Data & Analytics Summit 2018 held March 5-8 in Grapevine, Texas. Hundreds of organizations are leveraging Databricks’ Unified Analytics Platform as a simplified approach for data science and data engineering teams to accelerate innovation and make data-driven business decisions based on big data analytics and artificial intelligence (AI).  Databricks, recently named a Visionary in the Gartner Magic Quadrant for Data Science and Machine-Learning Platforms 2018, focuses on making Big Data and AI simple for enterprise organizations.



The Gartner Data & Analytics Summit will offer a holistic view of current trends and topics around data management, business intelligence (BI), and analytics, including innovative technologies such as AI, blockchain and IoT. Enterprises attend the Summit to learn about the shift toward a data-driven culture to lead the way to better business outcomes. Databricks’ Unified Analytics Platform directly addresses organizations’ issues associated with AI adoption and deployment, making this technology suitable for all businesses.

Databricks’ Unified Analytics Platform is a cloud-based platform powered by Apache Spark, the most popular open source technology for big data processing and machine learning workloads.



“Most data and analytics leaders realize that when it comes to embarking on new AI and Machine Learning initiatives, it’s still really about the data first and foremost.  Their teams need to figure out how you get a massive amount of data, often in real-time, to your model in a way that supports an iterative process and generates a meaningful business result,” said Rick Schultz, chief marketing officer at Databricks. “The Databricks Unified Analytics Platform addresses precisely this problem and, as such, we expect strong engagement from the attendees of Gartner Data & Analytics Summit, many of whom already use Spark.”



To expand the global reach of the Unified Analytics Platform, Databricks recently announced a joint product partnership with Microsoft. The new alliance addresses customer demand for Spark on Microsoft Azure by offering the Unified Analytics Platform as a First Party Service called Azure Databricks. This new integrated service makes it easier for organizations around the globe to derive value from their Big Data and realize the promise of AI.  With Azure Databricks, customers can accelerate innovation with one-click set up and effortless integration with a wide variety of Microsoft data stores and services.

About Databricks
Databricks’ mission is to accelerate innovation for its customers by unifying Data Science, Engineering and Business. Founded by the team who created Apache Spark™, Databricks provides a Unified Analytics Platform for data science teams to collaborate with data engineering and lines of business to build data products. Users achieve faster time-to-value with Databricks by creating analytic workflows that go from ETL and interactive exploration to production. The company also makes it easier for its users to focus on their data by providing a fully managed, scalable, and secure cloud infrastructure that reduces operational complexity and total cost of ownership. Databricks, venture-backed by Andreessen Horowitz, NEA and Battery Ventures, among others, has a global customer base that includes Viacom, Shell and HP. For more information, visit www.databricks.com.

Microsoft Azure Databricks - Azure Power Lunch


Azure is the best place for Big Data & AI
We are excited to add Azure Databricks to the Azure portfolio of data services and have taken great care to integrate it with other Azure services to unlock key customer scenarios.

High-performance connectivity to Azure SQL Data Warehouse, a petabyte-scale, elastic cloud data warehouse, allows organizations to build modern data warehouses to load and process any type of data at scale for enterprise reporting and visualization with Power BI. It also enables data science teams working in Azure Databricks notebooks to easily access high-value data from the warehouse to develop models.
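
As a concrete illustration of that connectivity, the hedged PySpark sketch below reads a warehouse table into a Databricks notebook using the Azure SQL Data Warehouse connector that ships with Azure Databricks. The server, storage account, and table names are placeholders; check the connector documentation for the options current in your runtime.

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()   # already defined as `spark` in a Databricks notebook

# Placeholder connection details for the warehouse and the staging storage the connector uses.
dw_url = ("jdbc:sqlserver://mydwserver.database.windows.net:1433;"
          "database=mydw;user=loader;password=<password>")

fact_sales = (spark.read
              .format("com.databricks.spark.sqldw")                    # Azure SQL DW connector
              .option("url", dw_url)
              .option("tempDir", "wasbs://staging@mystorageacct.blob.core.windows.net/sqldw")
              .option("forwardSparkAzureStorageCredentials", "true")   # reuse storage creds set in the session
              .option("dbTable", "dbo.FactSales")
              .load())

fact_sales.groupBy("RegionKey").count().show()   # quick check that warehouse data is flowing
```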

Integration with Azure IoT Hub, Azure Event Hubs, and Azure HDInsight Kafka clusters enables enterprises to build scalable streaming solutions for real-time analytics scenarios such as recommendation engines, fraud detection, predictive maintenance, and many others.
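
For the streaming side, here is a hedged sketch of reading events from an HDInsight Kafka cluster with Spark Structured Streaming inside a Databricks notebook. Broker names and the topic are placeholders, and Event Hubs or IoT Hub would need their own connectors or Kafka-compatible endpoints.

```python
from pyspark.sql import SparkSession
from pyspark.sql.functions import col

spark = SparkSession.builder.getOrCreate()   # predefined as `spark` in a Databricks notebook

events = (spark.readStream
          .format("kafka")
          .option("kafka.bootstrap.servers", "wn0-kafka:9092,wn1-kafka:9092")  # placeholder brokers
          .option("subscribe", "clickstream")                                   # placeholder topic
          .option("startingOffsets", "latest")
          .load())

# Kafka delivers key/value as binary; cast the value before any scoring or aggregation.
decoded = events.select(col("value").cast("string").alias("event_json"))

query = (decoded.writeStream
         .format("console")        # swap for a Delta/Blob sink or a scoring pipeline
         .outputMode("append")
         .start())
# query.awaitTermination()         # block here when running as a standalone job
```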

Integration with Azure Blob Storage, Azure Data Factory, Azure Data Lake Store, Azure SQL Data Warehouse, and Azure Cosmos DB allows organizations to use Azure Databricks to clean, join, and aggregate data no matter where it sits.

We are committed to making Azure the best place for organizations to unlock the insights hidden in their data to accelerate innovation. With Azure Databricks and its native integration with other services, Azure is the one-stop destination to easily unlock powerful new analytics, machine learning, and AI scenarios.

Lift, shift, and modernize Apps using containers on Azure Service Fabric


Apache Spark + Databricks + Enterprise Cloud = Azure Databricks
Once you manage data at scale in the cloud, you open up massive possibilities for predictive analytics, AI, and real-time applications. Over the past five years, the platform of choice for building these applications has been Apache Spark: with a massive community at thousands of enterprises worldwide, Spark makes it possible to run powerful analytics algorithms at scale and in real time to drive business insights. Managing and deploying Spark at scale has remained challenging, however, especially for enterprise use cases with large numbers of users and strong security requirements.

Enter Databricks. Founded in 2013 by the team that started the Spark project, Databricks provides an end-to-end, managed Apache Spark platform optimized for the cloud. Featuring one-click deployment, autoscaling, and an optimized Databricks Runtime that can improve the performance of Spark jobs in the cloud by 10-100x, Databricks makes it simple and cost-efficient to run large-scale Spark workloads. Moreover, Databricks includes an interactive notebook environment, monitoring tools, and security controls that make it easy to leverage Spark in enterprises with thousands of users.

In Azure Databricks, we have gone one step beyond the base Databricks platform by integrating closely with Azure services through collaboration between Databricks and Microsoft. Azure Databricks features optimized connectors to Azure storage platforms (e.g. Data Lake and Blob Storage) for the fastest possible data access, and one-click management directly from the Azure console. This is the first time that an Apache Spark platform provider has partnered closely with a cloud provider to optimize data analytics workloads from the ground up.



Benefits for Data Engineers and Data Scientists
Why is Azure Databricks so useful for data scientists and engineers? Let’s look at some ways:

OPTIMIZED ENVIRONMENT
Azure Databricks is optimized from the ground up for performance and cost-efficiency in the cloud. The Databricks Runtime adds several key capabilities to Apache Spark workloads that can increase performance and reduce costs by as much as 10-100x when running on Azure:

High-speed connectors to Azure storage services such as Azure Blob Store and Azure Data Lake, developed together with the Microsoft teams behind these services.
Auto-scaling and auto-termination for Spark clusters to automatically minimize costs.
Performance optimizations including caching, indexing, and advanced query optimization, which can improve performance by as much as 10-100x over traditional Apache Spark deployments in cloud or on-premise environments.

SEAMLESS COLLABORATION
Remember the jump in productivity when documents became truly multi-editable? Why can’t we have that for data engineering and data science? Azure Databricks brings exactly that. Notebooks on Databricks are live and shared, with real-time collaboration, so that everyone in your organization can work with your data. Dashboards enable business users to call an existing job with new parameters. And Databricks integrates closely with PowerBI for interactive visualization.  All this is possible because Azure Databricks is backed by Azure Database and other technologies that enable highly concurrent access, fast performance and geo-replication.

EASY TO USE
Azure Databricks comes packaged with interactive notebooks that let you connect to common data sources, run machine learning algorithms, and learn the basics of Apache Spark to get started quickly. It also features an integrated debugging environment to let you analyze the progress of your Spark jobs from within interactive notebooks, and powerful tools to analyze past jobs. Finally, other common analytics libraries, such as the Python and R data science stacks, are preinstalled so that you can use them with Spark to derive insights. We really believe that big data can become 10x easier to use, and we are continuing the philosophy started in Apache Spark to provide a unified, end-to-end platform.

Architecture of Azure Databricks
So how is Azure Databricks put together? At a high level, the service launches and manages worker nodes in each Azure customer’s subscription, letting customers leverage existing management tools within their account.

Microsoft Data Platform - What's included



Specifically, when a customer launches a cluster via Databricks, a “Databricks appliance” is deployed as an Azure resource in the customer’s subscription.   The customer specifies the types of VMs to use and how many, but Databricks manages all other aspects. In addition to this appliance, a managed resource group is deployed into the customer’s subscription that we populate with a VNet, a security group, and a storage account. These are concepts Azure users are familiar with. Once these services are ready, users can manage the Databricks cluster through the Azure Databricks UI or through features such as autoscaling. All metadata (such as scheduled jobs) is stored in an Azure Database with geo-replication for fault tolerance.

Azure Databricks Architecture



For users, this design means two things. First, they can easily connect Azure Databricks to any storage resource in their account, e.g., an existing Blob Store subscription or Data Lake. Second, Databricks is managed centrally from the Azure control center, requiring no additional setup.

Is the traditional data warehouse dead?




Total Azure Integration
We are integrating Azure Databricks closely with all features of the Azure platform in order to provide the best of the platform to users. Here are some pieces we’ve done so far:

  • Diversity of VM types: Customers can use all existing VMs: F-series for machine learning scenarios, M-series for massive memory scenarios, D-series for general purpose, etc.
  • Security and Privacy: In Azure, ownership and control of data is with the customer. We have built Azure Databricks to adhere to these standards. We aim for Azure Databricks to provide all the compliance certifications that the rest of Azure adheres to.
  • Flexibility in network topology: Customers have a diversity of network infrastructure needs. Azure Databricks supports deployments in customer VNETs, which can control which sources and sinks can be accessed and how they are accessed.
  • Azure Storage and Azure Data Lake integration: these storage services are exposed to Databricks users via DBFS to provide caching and optimized analysis over existing data (see the sketch after this list).
  • Azure Power BI: Users can connect Power BI directly to their Databricks clusters using JDBC in order to query data interactively at massive scale using familiar tools.
  • Azure Active Directory provides control of access to resources and is already in use in most enterprises. Azure Databricks workspaces deploy into customer subscriptions, so AAD can naturally be used to control access to sources, results and jobs.
  • Azure SQL Data Warehouse, Azure SQL DB and Azure Cosmos DB: Azure Databricks easily and efficiently uploads results into these services for further analysis and real-time serving, making it simple to build end-to-end data architectures on Azure.
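
As referenced in the storage integration bullet above, here is a hedged sketch of mounting a Blob Storage container into DBFS and reading it from a notebook. `dbutils` and `spark` are predefined in Azure Databricks, while the storage account, container, and secret scope names are placeholder assumptions.

```python
# Sketch of the DBFS integration: mount an Azure Blob Storage container, then read it
# like any other path. Requires a pre-created secret scope holding the storage key.
storage_account = "mystorageacct"
container = "rawdata"
account_key = dbutils.secrets.get(scope="demo", key="storage-key")   # assumed secret scope

dbutils.fs.mount(
    source=f"wasbs://{container}@{storage_account}.blob.core.windows.net",
    mount_point="/mnt/rawdata",
    extra_configs={f"fs.azure.account.key.{storage_account}.blob.core.windows.net": account_key},
)

sales = spark.read.csv("/mnt/rawdata/sales/*.csv", header=True, inferSchema=True)
sales.cache()                                      # keep the hot working set in memory
sales.groupBy("country").sum("amount").show()      # simple clean/join/aggregate placeholder
```
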
In addition to all the integration you can see, we have worked hard to integrate in ways that you can’t see – but can see the benefits of.



Internally, we use Azure Container Service to run the Azure Databricks control plane and data planes in containers.

  • Accelerated Networking provides the fastest virtualized network infrastructure in the cloud.   Azure Databricks utilizes this to further improve Spark performance.
  • The latest generation of Azure hardware (Dv3 VMs), with NVMe SSDs capable of blazing 100 µs latency on I/O. These make Databricks I/O performance even better.
  • We are just scratching the surface though!  As the service becomes GA and moves beyond that, we expect to add continued integrations with other upcoming Azure services.


Conclusion
Microsoft and Databricks are very excited to partner together to bring you Azure Databricks. For the first time, a leading cloud provider and leading analytics system provider have partnered to build a cloud analytics platform optimized from the ground up – from Azure’s storage and network infrastructure all the way to Databricks’s runtime for Apache Spark. We believe that Azure Databricks will greatly simplify building enterprise-grade production data applications, and we would love to hear your feedback as the service rolls out.



Azure Stream Analytics

More Information:

https://databricks.com/product/azure

https://databricks.com/product/unified-analytics-platform

https://databricks.com/product/azure#video

http://www.jamesserra.com/archive/2017/11/what-is-azure-databricks/

https://azure.microsoft.com/en-us/blog/azure-databricks-industry-leading-analytics-platform-powered-by-apache-spark/

https://www.dotnetcatch.com/2017/11/17/microsoft-connect-2017-keynote-highlights/

https://www.brighttalk.com/webcast/12891/309475/getting-started-with-apache-spark-on-azure-databricks

https://alwaysupalwayson.blogspot.nl/search/label/Azure

https://alwaysupalwayson.blogspot.nl/2018/02/smarthotel360-showcasing-modern-apps.html

https://pages.databricks.com/AzureDatabricks-ML-AI.html

https://pages.databricks.com/GettingStartedSpark-AzureDatabricks.html

https://www.brighttalk.com/webcast/12891/309475

https://www.brighttalk.com/webcast/12891/303471

https://databricks.com/blog/2017/11/15/a-technical-overview-of-azure-databricks.html







Introducing Windows Server 2019 – now available in preview




What’s new in Windows Server 2019


Windows Server 2019 is built on the strong foundation of Windows Server 2016 – which continues to see great momentum in customer adoption. Windows Server 2016 is the fastest adopted version of Windows Server, ever! We’ve been busy since its launch at Ignite 2016 drawing insights from your feedback and product telemetry to make this release even better.



We also spent a lot of time with customers to understand the future challenges and where the industry is going. Four themes were consistent – Hybrid, Security, Application Platform, and Hyper-converged infrastructure. We bring numerous innovations on these four themes in Windows Server 2019.

Windows Server 1709 – Everything you need to know in 10 minutes

Hybrid cloud scenarios:

We know that the move to the cloud is a journey and often, a hybrid approach, one that combines on-premises and cloud environments working together, is what makes sense to our customers. Extending Active Directory, synchronizing file servers, and backup in the cloud are just a few examples of what customers are already doing today to extend their datacenters to the public cloud. In addition, a hybrid approach also allows for apps running on-premises to take advantage of innovation in the cloud such as Artificial Intelligence and IoT. Hybrid cloud enables a future-proof, long-term approach – which is exactly why we see it playing a central role in cloud strategies for the foreseeable future.

At Ignite in September 2017, we announced the Technical Preview of Project Honolulu – our reimagined experience for management of Windows and Windows Server. Project Honolulu is a flexible, lightweight browser-based locally-deployed platform and a solution for management scenarios. One of our goals with Project Honolulu is to make it simpler and easier to connect existing deployments of Windows Server to Azure services. With Windows Server 2019 and Project Honolulu, customers will be able to easily integrate Azure services such as Azure Backup, Azure File Sync, disaster recovery, and much more so they will be able to leverage these Azure services without disrupting their applications and infrastructure.


Security:

Security continues to be a top priority for our customers. The number of cyber-security incidents continues to grow, and the impact of these incidents is escalating quickly. A Microsoft study shows that attackers take, on average, just 24-48 hours to penetrate an environment after infecting the first machine. In addition, attackers can stay in the penetrated environment – without being noticed – for up to 99 days on average, according to a report by FireEye/Mandiant. We continue on our journey to help our customers improve their security posture by working on features that bring together learnings from running global-scale datacenters for Microsoft Azure, Office 365, and several other online services.

Our approach to security is three-fold – Protect, Detect and Respond. We bring security features in all three areas in Windows Server 2019.
On the Protect front, we introduced Shielded VMs in Windows Server 2016, which were enthusiastically received by our customers. Shielded VMs protect virtual machines (VMs) from compromised or malicious administrators in the fabric, so only VM admins can access them on a known, healthy, and attested guarded fabric. In Windows Server 2019, Shielded VMs will now support Linux VMs. We are also extending VMConnect to improve troubleshooting of Shielded VMs for Windows Server and Linux. We are adding Encrypted Networks that will let admins encrypt network segments with the flip of a switch, to protect the network layer between servers.

On the Detect and Respond front, in Windows Server 2019, we are embedding Windows Defender Advanced Threat Protection (ATP) that provides preventative protection, detects attacks and zero-day exploits among other capabilities, into the operating system. This gives customers access to deep kernel and memory sensors, improving performance and anti-tampering, and enabling response actions on server machines.

Application Platform:

A key guiding principle for us on the Windows Server team is a relentless focus on the developer experience. Two key aspects to call out for the developer community are improvements to Windows Server containers and the Windows Subsystem for Linux (WSL).

Since the introduction of containers in Windows Server 2016, we have seen great momentum in their adoption. Tens of millions of container images have been downloaded from Docker Hub. The team learned from feedback that a smaller container image size will significantly improve the experience of developers and IT Pros who are modernizing their existing applications using containers. In Windows Server 2019, our goal is to reduce the Server Core base container image to a third of its current size of 5 GB. This will reduce the download time of the image by 72%, further optimizing development time and performance.

We are also continuing to improve the choices available when it comes to orchestrating Windows Server container deployments. Kubernetes support is currently in beta, and in Windows Server 2019, we are introducing significant improvements to compute, storage, and networking components of a Kubernetes cluster.

One piece of feedback we constantly hear from developers is the complexity of navigating environments with both Linux and Windows deployments. To address that, we previously extended the Windows Subsystem for Linux (WSL) into Insider builds for Windows Server, so that customers can run Linux containers side by side with Windows containers on a Windows Server. In Windows Server 2019, we are continuing on this journey to improve WSL, helping Linux users bring their scripts to Windows while using industry standards like OpenSSH, curl, and tar.

Hyper-converged infrastructure (HCI): 

HCI is one of the latest trends in the server industry today. According to IDC, the HCI market grew 64% in 2016, and Gartner says it will be a $5 billion market by 2019. This trend is driven primarily by customers who understand the value of using x86 servers with high-performance local disks to serve their compute and storage needs at the same time. In addition, HCI gives them the flexibility to easily scale such deployments.

Customers looking for HCI solutions can use Windows Server 2016 and the Windows Server Software Defined program today. We partnered with industry leading hardware vendors to provide an affordable and yet extremely robust HCI solution with validated design. In Windows Server 2019 we are building on this platform by adding scale, performance, and reliability. We are also adding the ability to manage HCI deployments in Project Honolulu, to simplify the management and day-to-day activities on HCI environments.

Finally, Windows Server customers using System Center will be excited to know that System Center 2019 is coming and will support Windows Server 2019.

We have much more to share between now and the launch later this year. We will bring more details on the goodness of Windows Server 2019 in a blog series that will cover the areas above.

What’s new in Windows Server, version 1709 for the software-defined datacenter | BRK2278


Windows Server 2019 with no RDSH and Windows 10 Multi-user and even RDmi, where do we go?

The newest rumors and stories in my timeline suggest that the RDSH role has been dropped from Windows Server 2019. Windows Server 2019 has just been released as a preview. Some are installing it and finding that you can't install the Remote Desktop Services role anymore. Together with stories about a multi-user Windows 10 version and Microsoft working on RDmi, rumors come easily. My thoughts on this are captured in this blog; they are thoughts only so far – the truth is out there but not available to us right now. Perhaps my thoughts are far-fetched, but this is what came to mind. There is an update already; I have woven it into the article.

RDS(H)
Remote Desktop Session Host is a role of Remote Desktop Services. RDS is the backbone of a lot of virtual environments. Since the late 90s, we've seen Citrix and Microsoft progress their offerings based on this. You can't deploy Citrix XenApp, a VMware Horizon RDSH server, or Microsoft RDSH without this role enabled. Many companies rely on it. Multiple users can access applications or a desktop session on one server and work together without interfering with each other. It paved the way to a centralized desktop (before VDI came into play) with a reasonable TCO. One of the key benefits of this model was that data and application management was centralized.

The downside of the solution has always been that resources are shared, applications are not always supported, and features like Store apps are not available. Performance was a challenge for some use cases, and that's one of the reasons VDI was introduced: a single-user desktop with non-shared resources (shared on a different level).

Windows Server 2019
Soon after the Windows Server 2019 preview release became available, stories came out about the RDSH role being missing. I saw several reports of people trying to install the role but failing to do so. Of course, this is a preview, so we have to see whether the final version has the same limitation – otherwise, why would a preview drop a default role like this? It seems that the RDSH role is set to disappear and that customers will be offered other options; read on for those options.

Signs on the wall
There are signs on the wall that times are a-changing. Let's take a look at the different suspects in this case (I'm watching a detective show while writing). Windows 10 Multi-user and RDmi are the ones that come to mind.

MVPDays - New & Cool Tools! Management with Project Honolulu - Mike Nelson



Windows 10 Multi-user
Microsoft Windows 10 will be getting a multi-user version, so the initial thought was that the RDS roles are being transferred to Windows 10. In a way it would make sense: several features, such as access to Store apps and OneDrive Files On-Demand, are easier to deliver when running Windows 10. That, however, is only true when you run a single-user Windows 10 platform; a multi-user environment will have issues with those features no matter the operating system. Using Windows 10 Multi-user to replace an RDSH server just to bring certain features seems far-fetched.

One reason I can think of is licensing. Server licenses are less expensive, and transferring RDS to Windows 10 would force customers to acquire Windows 10 desktop licenses along with the CALs. For a lot of customers that would be a huge issue, perhaps even making them consider moving back to physical devices. Microsoft has announced that Windows Server 2019 might be more expensive, and pushing people toward RDS-VDI environments might hurt them more than they would like. Initially I thought this was the reason for the missing role, but perhaps there is more to it. I still think this is a valid option, but one for the future, when RDmi is a more common scenario.



RDmi
Another Microsoft announcement is RDmi, Remote Desktop modern infrastructure. Another initial thought concerns Citrix XenApp Essentials and RDmi, but that's a different topic – one I'll be working on from the 1st of April. Back to the topic.

RDmi, Remote Desktop modern infrastructure, is the evolution of RDS and is offered as a .NET service running in Azure. The idea behind it is that all the roles you need to set up an RDS environment (given that you want a Microsoft environment) are offered as a service. I won't go deeper into RDmi right now; the intent of this article is not to explain RDmi. What I see in this offering is that Microsoft is moving RDS to Azure and enabling it to work with HTML5 clients as well. It enables more flexibility and decouples some components from your network. There is far more to learn about this, but the drawing and link below give a very good insight.

More info is found at https://cloudblogs.microsoft.com/enterprisemobility/2017/09/20/first-look-at-updates-coming-to-remote-desktop-services/

A migration strategy will be offered to customers when it goes live. We have to wait a bit for more info; there are some blogs online already, so do your own "Google" search.

Windows 10 Multi-user, RDmi or “old skool” RDSH, where do we go?
RDmi is the more interesting suspect; it brings modern features to RDS. It brings Azure into the picture and would offer customers a route to migrate to the new RDS offering without huge investments and testing. Not every customer is keen on moving their workload to the cloud, so that might be why Windows 10 Multi-user mode is coming, although I wonder whether customers are looking for that one.

I think, but that is just me, that Multi-user Windows 10's use case is different. I'm not sure yet what that use case is, but it is not to massively replace RDSH. Migrating to Windows 10 would cost customers a lot of effort, assuming they now run a server version for their desktop environment. And the Windows 10 features would not be usable with multiple users working alongside each other.

Extending Windows Admin Center to manage your applications and infrastructure using modern browser-based technologies



So there are two offerings on the table, and if you ask me, I think there will be a campaign to move customers to RDmi. It won't take away the burden of image management, but it will offer the roles as a service, relieving IT admins of that management. We've seen similar offerings from Citrix and VMware: take the management burden away and let IT admins take care of the image only. Customers that can't or won't move will still run an on-premises environment, presumably with Windows 10 in the future (1809). Microsoft is mapping out the future and its idea of how you offer RDSH – as a service, that is.

Because Microsoft has shifted to a more gradual upgrade of Windows Server, many of the features that will become available with Windows Server 2019 have already been in use in live corporate networks, and here are half a dozen of the best.

Enterprise-grade hyperconverged infrastructure (HCI)

With the release of Windows Server 2019, Microsoft rolls up three years of updates for its HCI platform. That’s because the gradual upgrade schedule Microsoft now uses includes what it calls Semi-Annual Channel releases – incremental upgrades as they become available. Then every couple of years it creates a major release called the Long-Term Servicing Channel (LTSC) version that includes the upgrades from the preceding Semi-Annual Channel releases.

Windows Admin Center



The LTSC Windows Server 2019 is due out this fall, and is now available to members of Microsoft’s Insider program.

While the fundamental components of HCI (compute, storage and networking) have been improved with the Semi-Annual Channel releases, for organizations building datacenters and high-scale software defined platforms, Windows Server 2019 is a significant release for the software-defined datacenter.

With the latest release, HCI is provided on top of a set of components that are bundled in with the server license. This means a backbone of servers running Hyper-V to enable dynamic increases or decreases of capacity for workloads without downtime.

GUI for Windows Server 2019

A surprise for many enterprises that started to roll out the Semi-Annual Channel versions of Windows Server 2016 was the lack of a GUI for those releases. The Semi-Annual Channel releases only supported ServerCore (and Nano) GUI-less configurations. With the LTSC release of Windows Server 2019, IT Pros will once again get their desktop GUI of Windows Server in addition to the GUI-less ServerCore and Nano releases.



Project Honolulu

With the release of Windows Server 2019, Microsoft will formally release their Project Honolulu server management tool.



Project Honolulu is a central console that allows IT pros to easily manage GUI and GUI-less Windows 2019, 2016 and 2012R2 servers in their environments.

The evolution of Windows Server: Project Honolulu and what's new in 1709



Early adopters have found the simplicity of management that Project Honolulu provides by rolling up common tasks such as performance monitoring (PerfMon), server configuration and settings tasks, and the management of Windows Services that run on server systems.  This makes these tasks easier for administrators to manage on a mix of servers in their environment.

Updates to server management with the Windows Admin Center (formerly Honolulu) & PowerShell Core


Improvements in security

Microsoft has continued to include built-in security functionality to help organizations address an “expect breach” model of security management.  Rather than assuming firewalls along the perimeter of an enterprise will prevent any and all security compromises, Windows Server 2019 assumes servers and applications within the core of a datacenter have already been compromised.

Windows Server 2019 includes Windows Defender Advanced Threat Protection (ATP), which assesses common vectors for security breaches and automatically blocks and alerts about potential malicious attacks. Users of Windows 10 have received many of the Windows Defender ATP features over the past few months. Including Windows Defender ATP on Windows Server 2019 lets them take advantage of data storage, network transport and security-integrity components to prevent compromises on Windows Server 2019 systems.

The battle to increase security continues unabated, and in this version we get Windows Defender ATP Exploit Guard, which is an umbrella for four new features: Network protection blocks outbound access from processes on the server to untrusted hosts/IP addresses based on Windows Defender SmartScreen information. Controlled folder access protects specified folders against untrusted process access such as ransomware, whereas Exploit protection mitigates vulnerabilities in similar ways to what EMET used to do. Finally, Attack Surface Reduction (ASR) lets you set policies to block malicious files, scripts, lateral movement and so on.

Windows Defender Advanced Threat Protection (ATP) is now available for Windows Server, as well, and can integrate with your current deployment.

These measures will increase the security of your Hyper-V hosts but another feature (also first seen in a SAC release) applies directly to virtualization deployments: Encrypted Networks in SDN. A single click when you create a new virtual network in the SDN stack will ensure that all traffic on that network is encrypted, preventing eavesdropping. Note that this does not protect against malicious administrators but curiously, Microsoft has promised such protection in forthcoming versions, bringing the network protection in line with the host security Shielded Virtual Machines offer.

Smaller, more efficient containers

Organizations are rapidly minimizing the footprint and overhead of their IT operations and eliminating more bloated servers with thinner and more efficient containers. Windows Insiders have benefited by achieving higher density of compute to improve overall application operations with no additional expenditure in hardware server systems or expansion of hardware capacity.

Windows Server 2019 has a smaller, leaner ServerCore image that cuts virtual machine overhead by 50-80 percent.  When an organization can get the same (or more) functionality in a significantly smaller image, the organization is able to lower costs and improve efficiencies in IT investments.



There's a lot of focus on hybrid cloud in this preview, which makes sense, given Microsoft's assertion that most businesses will be in a hybrid state for a long time to come. The focus on containers continues with much smaller images available for both the server core and Nano server images.

But the coolest feature yet is the ability to run Linux containers on Windows Server. This first saw light in one of the SAC releases and it makes a lot of sense. Remember that in Windows (unlike Linux) we have two flavors of containers, Windows Containers and Hyper-V Containers. For a developer they work exactly the same and it's a deployment choice (develop on normal containers and deploy in production in Hyper-V containers). The Hyper-V flavor gives you the security isolation of a VM although they're much smaller than a "real" VM. So, the next logical step was running a different OS in the container, in this case Linux. Following a tutorial, I was able to get a Linux  container up and running quickly.

Windows subsystem on Linux

A decade ago, one would rarely say Microsoft and Linux in the same breath as complementary platform services, but that has changed. Windows Server 2016 has open support for Linux instances as virtual machines, and the new Windows Server 2019 release makes huge headway by including an entire subsystem optimized for the operation of Linux systems on Windows Server.

The Windows Subsystem for Linux extends basic virtual machine operation of Linux systems on Windows Server, and provides a deeper layer of integration for networking, native filesystem storage and security controls. It can enable encrypted Linux virtual instances. That’s exactly how Microsoft provided Shielded VMs for Windows in Windows Server 2016, but now native Shielded VMs for Linux on Windows Server 2019.

Enterprises have found the optimization of containers along with the ability to natively support Linux on Windows Server hosts can decrease costs by eliminating the need for two or three infrastructure platforms, and instead running them on Windows Server 2019.

Because most of the “new features” in Windows Server 2019 have been included in updates over the past couple years, these features are not earth-shattering surprises.  However, it also means that the features in Windows Server 2019 that were part of Windows Server 2016 Semi-Annual Channel releases have been tried, tested, updated and proven already, so that when Windows Server 2019 ships, organizations don’t have to wait six to 12 months for a service pack of bug fixes.

Windows Admin Center

No discussion of the future of Windows Server is complete without mentioning the free, Web-based Windows Admin Center (WAC), formerly known as "Project Honolulu." It's going to be the GUI for managing Windows Server, including Hyper-V servers, clusters, Storage Spaces Direct and HCI clusters. It's got a lot of benefits over the current mix of Server Manager, Hyper-V Manager and Failover Cluster Manager (along with PowerShell) that we use today, including the simple fact that it's all in the one UI.

How to get started with Windows Admin Center


Updates to server management with the Windows Admin Center (formerly Honolulu) & PowerShell Core


Storage Replica & Migration

In Windows Server 2016 (Datacenter only) we finally got the missing puzzle piece in Microsoft's assault on SANs -- Storage Replica (SR). This directly competes with (very expensive) SAN replication technologies and lets you replicate from any volume on a single server or a cluster to another volume in another location (synchronously up to 150 km [about 90 miles], asynchronously anywhere on the planet). This is useful for creating stretched Hyper-V clusters for very high resiliency or for Disaster Recovery (DR) in general.

In Windows Server 2019 Standard we're getting SR "Lite": a single volume per server (unlimited in Datacenter), a single partnership per volume (unlimited in Datacenter) and volumes up to 2 TB (unlimited in Datacenter). These are the current limitations in the preview, and voting is open to change this.

Hyper-V Replica is a different technology than SR. For instance, you could create a stretched Hyper-V cluster with SR as the transport mechanism for the underlying storage between the two locations and then use Hyper-V Replica for DR, replicating VMs to a third location or to Azure.

A totally new feature, Storage Migration Service is coming in Windows Server 2019. Intended to solve the problem of migrating from older versions of Windows Server to 2019 or Azure, it's not directly related to Hyper-V, although you can of course use it from within VMs or to migrate data to Azure Stack.

Data Deduplication is now available for Storage Spaces Direct (S2D) with the ReFS filesystem, so you could be looking at saving up to 50 percent of disk space. Speaking of S2D, Microsoft now supports Persistent Memory (aka Storage Class Memory) which is essentially battery-backed DDR memory sticks, leading to storage with incredibly low latency. Also new is performance history for S2D, where you can get a history of performance across drives, NICs, servers, VMs, vhd/vhdx files, volumes and the overall cluster. You can either use PowerShell or Windows Admin Center to access the data.

Failover Clustering

One of the biggest gripes I hear from cluster administrators is the difficulty of moving a cluster from one domain to another (mergers are a common cause of this); this is being addressed in 2019. Using just two PowerShell cmdlets you can remove the cluster name account from the original Active Directory domain, shut down the cluster functionality, unjoin from the source domain and add all nodes to a workgroup, then join them to the new domain and create new cluster resources in the destination AD domain. This definitely adds flexibility around Hyper-V clusters and their domain status.



Speaking of clusters, most businesses I speak to tend to keep the number of nodes in their clusters relatively low (six, eight, 12 and 16 nodes), even though the max number of nodes is 64, and instead have more clusters. Each of these clusters is totally separate but that's going to change in Windows Server 2019. You'll be able to group several clusters together (Hyper-V, Storage and even Hyper-Converged), with a Master cluster resource running on one cluster, coordinating with a Cluster Set Worker in each cluster. You'll be able to Live Migrate VMs from one cluster to another. I can see this being useful for scaling out Azure Stack (currently limited to 12 nodes) and for bringing the concept of the Software-Defined Datacenter (SDDC) closer to reality.

Another minor but potentially vital detail is using a file share witness stored in DFS. This isn't and has never been supported but not everyone reads the documentation. Imagine a six-node cluster with three nodes in a separate building with a file share witness as the tie breaker for the quorum. You could end up in a situation where the network connection between the two buildings is severed and the three nodes on one side keeps the cluster service (and thus the VMs) running because they can talk to the file share witness. But the other side has a DFS replicated copy of the same file share witness, so they, too, decide to keep the cluster service running (as they also have a majority of votes) and both sides could potentially be writing to back-end storage simultaneously, leading to serious data corruption. In Windows Server 2019 if you try to store a file share witness in DFS you'll get an error message and if it's added to DFS replication at some point in time later, it'll stop working.

You can also create a file share witness that doesn't use an AD account for scenarios where a DC isn't available (DMZ), or in a workgroup/cross-domain cluster.


Hyper-converged infrastructure (HCI): In Windows Server 2019, HCI will get scale, performance, and reliability. The team is also adding the ability to manage HCI deployments in Project Honolulu, to simplify the management and day-to-day activities on HCI environments.

Windows Server 2019 will be integrated with Project Honolulu, a browser-based management solution. Microsoft aims to make it easier for enterprises to connect their existing deployments of Windows Server to Azure services.

“With Windows Server 2019 and Project Honolulu, customers will be able to easily integrate Azure services such as Azure Backup, Azure File Sync, disaster recovery, and much more so they will be able to leverage these Azure services without disrupting their applications and infrastructure,” wrote Erin Chapple, Director of Program Management, Windows Server.

Microsoft is enhancing security in Windows Server 2019 with a three-point approach: protect, detect and respond. The company has added Shielded VMs with support for Linux VMs as well, which will protect VMs against malicious activity. The addition of Encrypted Networks will enable encryption of network segments to protect the network layer between servers.

Windows Server 2019 will have embedded Windows Defender Advanced Threat Protection (ATP) to detect attacks in the operating system. Sysadmins will have access to deep kernel and memory sensors, so that they can respond on server machines.

Under application platform, there will be improved orchestration for Windows Server container deployments. Windows Subsystem for Linux (WSL) support in the new version will enable Linux users to bring their scripts to Windows while using industry standards like OpenSSH, curl, and tar. There is also support for Kubernetes, which is currently in beta.

Windows Server 2019 reduces the size of the Server Core base container image from 5 GB to less than 2 GB. This will reduce the image download time by 72%, resulting in optimized development time and performance.

On the hyper-converged infrastructure (HCI) front, Microsoft said that it has added the ability in Windows Server 2019 to manage HCI deployments using Project Honolulu. This will make the management of several activities in HCI environments simpler.


This is a significant change that is helping organizations plan their adoption of Windows Server 2019 sooner than they may have adopted a major release in the past, and to gain the benefits of Windows Server 2019 for the security, scalability, and optimized datacenter requirements so badly needed in today's fast-paced environments.

Sign up for the Insiders program to access Windows Server 2019

We know you probably cannot wait to get your hands on the next release, and the good news is that the preview build is available today to Windows Insiders: https://insider.windows.com/en-us/for-business-getting-started-server/

Join the program to ensure you have access to the bits. For more details on this preview build, check out the Release Notes.

We love hearing from you, so don’t forget to provide feedback using the Windows Feedback Hub app, or the Windows Server space in the Tech community.

Frequently asked questions

Q: When will Windows Server 2019 be generally available?

A: Windows Server 2019 will be generally available in the second half of calendar year 2018.

Q: Is Windows Server 2019 a Long-Term Servicing Channel (LTSC) release?

A: Windows Server 2019 will mark the next release in our Long-Term Servicing Channel. LTSC continues to be the recommended version of Windows Server for most of the infrastructure scenarios, including workloads like Microsoft SQL Server, Microsoft SharePoint, and Windows Server Software-defined solutions.

Q: What are the installation options available for Windows Server 2019?

A: As an LTSC release Windows Server 2019 provides the Server with Desktop Experience and Server Core installation options – in contrast to the Semi-Annual Channel that provides only the Server Core installation option and Nano Server as a container image. This will ensure application compatibility for existing workloads.

Q: Will there be a Semi-Annual Channel release at the same time as Windows Server 2019?

A: Yes. The Semi-Annual Channel release scheduled to go at the same time as Windows Server 2019 will bring container innovations and will follow the regular support lifecycle for Semi-Annual Channel releases – 18 months.

Q: Does Windows Server 2019 have the same licensing model as Windows Server 2016?

A: Yes. Check more information on how to license Windows Server 2016 today in the Windows Server Pricing page. It is highly likely we will increase pricing for Windows Server Client Access Licensing (CAL). We will provide more details when available.

More Information:

https://blogs.windows.com/windowsexperience/2018/05/15/announcing-windows-server-2019-insider-preview-build-17666/#OkRm0tD7feCTWXAF.97

https://www.techtask.com/whats-coming-sharepoint-server-2019-premises/

https://www.networkworld.com/article/3265052/data-center/top-6-features-in-windows-server-2019.html

https://www.microsoft.com/en-us/cloud-platform/windows-server

https://virtualizationreview.com/articles/2018/04/30/windows-server-2019-new-in-hyperv.aspx

http://news.thewindowsclub.com/windows-server-2019-features-92038/

https://docs.microsoft.com/en-us/windows-server/manage/windows-admin-center/understand/windows-admin-center

https://cloudblogs.microsoft.com/windowsserver/2018/04/12/announcing-windows-admin-center-our-reimagined-management-experience/

https://cloudblogs.microsoft.com/windowsserver/2018/03/20/introducing-windows-server-2019-now-available-in-preview/

https://robbeekmans.net/euc/windows-server-2019-with-no-rdsh-and-windows-10-multi-user-and-even-rdmi-where-do-we-go/

https://cloudblogs.microsoft.com/windowsserver/2018/03/29/windows-server-semi-annual-channel-update/

https://insider.windows.com/en-us/for-business-getting-started-server/


IBM Summit High Performance Computing: Accelerating Cognitive Workloads with Machine Learning



HPC and HPDA for the Cognitive Journey with OpenPOWER




The high-performance computing landscape is evolving at a furious pace that some are describing as an important inflection point, as Moore’s Law delivers diminishing returns while performance demands increase. Leaders of organizations are grappling with how to embrace recent system-level innovations like acceleration, while simultaneously being challenged to incorporate analytics into their HPC workloads.

Intro summit webinar: Innovative and Novel Computational Impact on Theory and Experiment (INCITE) Program for 2019



On the horizon, even more demanding applications built with machine learning and deep learning are emerging to push system demands to all-new highs. With all of this change in the pipeline, the usual tick-tock of minor code tweaks to accompany nominal hardware performance improvements can’t continue as usual. For many HPC organizations, significant decisions need to be made.

Introduction to ECP’s newest Focus Area, Hardware and Integration (HI)


Realizing that these demands could only be addressed by an open ecosystem, IBM partnered with industry leaders such as Google, Mellanox, and NVIDIA to form the OpenPOWER Foundation, dedicated to stewarding the Power CPU architecture into the next generation.

IBM Power9 Features and Specifications

A data-centric approach to HPC with OpenPOWER

In 2014, this disruptive approach to HPC innovation led to IBM being awarded two contracts to build the next generation of supercomputers as part of the US Department of Energy's Collaboration of Oak Ridge, Argonne, and Lawrence Livermore (CORAL) program. In partnership with NVIDIA and Mellanox, we demonstrated to CORAL that a "data-centric" approach to systems – an architecture designed to embed compute power everywhere data resides in the system, positioning users for a convergence of analytics, modeling, visualization and simulation that can drive new insights at incredible speeds – could help them achieve their goals. Now, on the three-year anniversary of that agreement, we are pleased to announce that we are delivering on our project, with our next-generation IBM Power Systems with NVIDIA Volta GPUs being deployed at Oak Ridge and Lawrence Livermore National Labs.


Moving mountains

Both systems, Summit at ORNL and Sierra at LLNL, are being installed as you read this, with completion expected early next year, and both are impressive. Summit is expected to increase individual application performance 5 to 10 times over Titan, Oak Ridge's older supercomputer, and Sierra is expected to provide 4 to 6 times the sustained performance of Sequoia, Lawrence Livermore's older supercomputer.

Summit Supercomputer


With Summit in place, Oak Ridge National Labs will advance their stated mission: “Be able to address, with greater complexity and higher fidelity, questions concerning who we are, our place on earth, and in our universe.” But most importantly, the clusters will position them to push the boundaries of one of the most important technological developments of our generation, artificial intelligence (AI).

IBM's world-class Summit supercomputer gooses speed with AI abilities


Built for AI, built for the future

However, emerging AI workloads are vastly different from traditional HPC workloads. The measurements of performance listed above, while interesting, do not really capture the performance requirements of deep learning algorithms. With AI workloads, bottlenecks shift away from compute and networking toward data movement at the CPU level. IBM POWER9 systems are specifically designed for these emerging challenges.

IBM Readies POWER9-based Systems for US Department of Energy CORAL Supercomputers at SC17


“We’re excited to see accelerating progress as the Oak Ridge National Laboratory Summit supercomputer continues to take shape. The infrastructure is now complete and we’re beginning to deploy the IBM POWER9 compute nodes.  We’re still targeting early 2018 for the final build-out of the Summit machine, which we expect will be among the world’s fastest supercomputers. The advanced capabilities of the IBM POWER9 CPUs coupled with the NVIDIA Volta GPUs will significantly advance the computational performance of DOE’s mission critical applications,” says Buddy Bland, Oak Ridge Leadership Computing Facility Director.

AI, The Next HPC Workload



POWER9 leverages PCIe Gen-4, next-generation NVIDIA NVLink interconnect technology, memory coherency and more features designed to maximize throughput for AI workloads. This should translate to more overall performance and larger scales while reducing space creep due to excessive node counts and potentially out-of-control power consumption. Projections from competitors show anticipated node counts exceeding 50,000 to break into exascale territory; but this is not until 2021. Already this year, IBM was able to leverage distributed deep learning to reduce model training time from 16 days to 7 hours by successfully scaling TensorFlow and Caffe across 256 NVIDIA Tesla GPUs. These new systems feature 100 times more GPUs spread across thousands of nodes, meaning the only theoretical limit to the deep learning benchmarks we can set with these new supercomputers is our own imaginations.
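The following is a minimal sketch of the data-parallel pattern behind that kind of scaling, using TensorFlow's stock tf.distribute.MirroredStrategy rather than IBM's own Distributed Deep Learning communication layer; the model, dataset and batch size are arbitrary placeholders chosen only to show how a model is replicated across GPUs and each batch split between them.

```python
# Minimal sketch of data-parallel training across multiple GPUs with TensorFlow.
# Illustrative only: IBM's Distributed Deep Learning (DDL) library uses its own
# communication layer, but the basic idea of replicating the model and splitting
# batches across devices is the same.
import tensorflow as tf

strategy = tf.distribute.MirroredStrategy()          # one replica per visible GPU
print("Replicas in sync:", strategy.num_replicas_in_sync)

with strategy.scope():                               # variables are mirrored on every GPU
    model = tf.keras.Sequential([
        tf.keras.layers.Dense(512, activation="relu", input_shape=(784,)),
        tf.keras.layers.Dense(10, activation="softmax"),
    ])
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])

(x_train, y_train), _ = tf.keras.datasets.mnist.load_data()
x_train = x_train.reshape(-1, 784).astype("float32") / 255.0

# The global batch is split evenly across replicas; gradients are all-reduced each step.
model.fit(x_train, y_train, epochs=2, batch_size=256 * strategy.num_replicas_in_sync)
```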


Start with the data



Data preparation for deep learning

All machine learning and deep learning models train on large amounts of data. Fortunately (and unfortunately), organizations are swimming in data sitting in structured and unstructured forms, and beyond the data they have under their control, organizations also have access to data for free or for a fee from a variety of sources.
Often, little of this data is in the proper placement or form for training a new AI model. To date, we have found that this has been a problem largely solved by manual methods: miles and miles of Python scripting, often run inside Spark clusters for speed of execution, along with a lot of orphan code.

Share Your Science: Accelerating Cognitive Workloads with Machine Learning


To help shorten transformation time, PowerAI Enterprise integrates a structured, template-based approach to building and transforming data sets. It starts with common output formats (LMDB, TensorFlowRecords, Images for Vector Output), and allows users to define the input format/structure of raw data and some of the key characteristics of what is needed in the transform step.
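PowerAI Enterprise's transform templates are proprietary, but the kind of work they automate can be sketched with plain TensorFlow: the snippet below converts a hypothetical raw_images/<label>/ directory of labelled images into a TFRecord file, one of the output formats mentioned above. The directory layout, file names and feature keys are assumptions for illustration only.

```python
# Hand-rolled sketch of the kind of transform a data-preparation template performs:
# converting labeled image files into a TFRecord file for training.
# The directory layout (raw_images/<label>/<file>.jpg) is a hypothetical example.
import os
import tensorflow as tf

def image_example(image_bytes: bytes, label: int) -> tf.train.Example:
    """Wrap one image and its label in a tf.train.Example protobuf."""
    feature = {
        "image": tf.train.Feature(bytes_list=tf.train.BytesList(value=[image_bytes])),
        "label": tf.train.Feature(int64_list=tf.train.Int64List(value=[label])),
    }
    return tf.train.Example(features=tf.train.Features(feature=feature))

def convert_to_tfrecord(src_dir: str, out_path: str) -> None:
    labels = sorted(os.listdir(src_dir))             # each sub-directory is a class
    with tf.io.TFRecordWriter(out_path) as writer:
        for label_id, label in enumerate(labels):
            for fname in os.listdir(os.path.join(src_dir, label)):
                with open(os.path.join(src_dir, label, fname), "rb") as f:
                    writer.write(image_example(f.read(), label_id).SerializeToString())

convert_to_tfrecord("raw_images", "train.tfrecord")
```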

The data import tools in PowerAI Enterprise are aware of the size and the complexity of the data and the resources available to transform the data. For this reason, the integrated resource manager is able to intelligently manage the execution of the job: helping to optimize for either low cost (run across the fewest number of nodes/cores) or optimize for the fastest execution of the transform job (run across more nodes/cores).



Integrated into the data preparation step is a quality check function which is designed to allow a data engineer and a data scientist to check the clarity of the signal in the data, running a simplified model and sample training from within the data import tool. Although not as sophisticated as a fully-developed model, this “gut check” allows a data scientist to discover early on in the process whether there are obvious issues or deficiencies in the training data set before investing significant time in the model development phase.

Cognitive Computing: From Data to Analytics to Learning

The majority of data doesn't offer much value unless iteratively and progressively analyzed by the user and the system to produce powerful insights with recommended actions for the best outcome(s). In fact, IBM Watson (IBM’s leadership Cognitive system) constantly sifts through data, discovers insights, learns and determines the best course of action(s).

“The cognitive computing landscape continues to evolve rapidly; giving clients unique capabilities to progressively solve complex problems for higher value.”

Learning systems (cognitive and deep machine learning) are interactive analytics systems that continuously build knowledge over time by processing natural language and data. These systems learn a domain by experience, just as humans do, and can discover and suggest the "best course of action", providing highly time-critical, valuable guidance to humans or simply executing this "next best action". IBM Watson is the premier cognitive system in the market.

Converging big data, AI, and BI with a GPU-accelerated database by Karthik Lalithraj


The underlying technologies for Deep Learning include Artificial Neural Networks (ANN)–neural networks inspired by and designed to mimic the function of the cortex, the thinking matter of the brain. Driverless autonomous cars, robotics and personalized medical therapies are some key disruptive innovations enabled by Deep Learning.
A performance-optimized infrastructure is critical for the Cognitive Computing journey.

Speed up the model development process

PowerAI Enterprise includes powerful model setup tools designed to address the earliest “dead end” training runs. Integrated hyperparameter optimization automates the process of characterizing new models by sampling from the training data set and instantiating multiple small training jobs across cluster resources (this means sometimes tens, sometimes thousands of jobs depending on the complexity of the model). The tool is designed to select the most promising combinations of hyperparameters to return to the data science teams. The outcome: fewer non-productive early runs and more time to focus on refining models for greater organizational value.
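The integrated optimizer itself isn't shown here, but the idea it automates can be sketched as a simple random search: sample candidate configurations, run a short training job on a sample of the data for each, and keep the most promising one. The search space and the toy train_and_score stand-in below are hypothetical.

```python
# Minimal random-search sketch of hyperparameter characterization: sample a handful of
# candidate configurations, score each with a short training run, and keep the best.
import random

SEARCH_SPACE = {
    "learning_rate": [1e-4, 3e-4, 1e-3, 3e-3, 1e-2],
    "batch_size":    [32, 64, 128, 256],
    "dropout":       [0.1, 0.3, 0.5],
}

def sample_config():
    return {name: random.choice(values) for name, values in SEARCH_SPACE.items()}

def train_and_score(config) -> float:
    # Toy stand-in for a short training run on a data sample; in practice this would
    # train a small model and return a validation metric.
    return -abs(config["learning_rate"] - 1e-3) - 0.001 * config["batch_size"] * config["dropout"]

def random_search(num_trials: int = 20):
    best_config, best_score = None, float("-inf")
    for _ in range(num_trials):
        config = sample_config()
        score = train_and_score(config)          # small job on a sample of the data
        if score > best_score:
            best_config, best_score = config, score
    return best_config, best_score

print(random_search())
```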

Once you have selected hyperparameters, you can begin bringing together all of the different elements of a deep learning model training run.

Lecture 15 | Efficient Methods and Hardware for Deep Learning


This next phase in the development process is extremely iterative. Even with assistance in selecting the correct hyperparameters, ensuring that your data is clean and has a clear signal within it, and that you were able to operate at the appropriate level of scale, chances are you will still be repeating training runs. By instrumenting the training process, PowerAI Enterprise can allow a data scientist to see feedback in real time on the training cycle.


PowerAI Enterprise provides the ability to visualize current progress and status of your training job, including iteration, loss, accuracy and histograms of weights, activations, and gradients of the neural network.
With this feedback, data scientists and model developers are alerted when the training process begins to go awry. These early warnings can allow data scientists and model developers to stop training runs that will eventually go nowhere and adjust parameters.
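The PowerAI Enterprise visualizations are built into the product, but the same workflow can be approximated with stock Keras callbacks: log scalars and per-layer weight histograms for inspection, and stop runs whose validation loss has stalled. The log directory and patience value below are arbitrary choices.

```python
# Generic Keras sketch of the same idea: log loss/accuracy and weight histograms for
# visualization, and stop runs that are going nowhere. Not the PowerAI Enterprise
# tooling itself, just the open-source equivalent of the workflow it automates.
import tensorflow as tf

def make_callbacks(log_dir: str = "./train_logs"):
    return [
        # Write scalars and per-layer weight histograms every epoch for inspection.
        tf.keras.callbacks.TensorBoard(log_dir=log_dir, histogram_freq=1),
        # Abort a run whose validation loss has not improved for 5 epochs.
        tf.keras.callbacks.EarlyStopping(monitor="val_loss", patience=5,
                                         restore_best_weights=True),
    ]

# Usage (model and data assumed to exist):
# model.fit(x_train, y_train, validation_split=0.1, epochs=100,
#           callbacks=make_callbacks())
```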

These workflow tools run on top of IBM’s scalable, distributed deep learning platforms. They take the best of open source frameworks and augment them for both large model support and better cluster performance, both of which open up the potential to take applied artificial intelligence into areas and use cases which were not previously feasible.



Bringing all these capabilities together accelerates development for data scientists, and the combination of automating workflow and extending the capabilities of open source frameworks unlocks the hidden value in organizational data.

As Gurinder Grewal, Senior Director, Architecture at PayPal said at IBM’s Think conference: “How do you take all these technologies and marry them together to build end to end platform that we can hand over to a data scientist and the business policy owners so they can extract most value out of the data?  I think that’s what excites us most about things your company is working on in terms of the PowerAI platform…  I think that’s one of the things we actually really appreciate the openness of the platform, extracting the most value out of the compute power we have and the power from the data.”

A foundation for data science as a service

At the core of the platform is an enterprise-class management software system for running compute- and data-intensive distributed applications on a scalable, shared infrastructure.

IBM PowerAI Enterprise supports multiple users and lines of business with multi-tenancy and end-to-end security, including role-based access controls. Organizational leaders are looking to deploy AI infrastructure at scale. The combination of integrated security (including role-based access and encryption of workload and data), the ability to support service level agreements, and extremely scalable resource orchestration designed for very large compute infrastructure means that it is now possible to share data science environments across the organization.

High Performance Computing and the Opportunity with Cognitive Technology


One customer which has successfully navigated the new world of AI is Wells Fargo. They use deep learning models to comply with a critical financial validation process.  Their data scientists build, enhance, and validate hundreds of models each day. Speed is critical, as well as scalability, as they deal with greater amounts of data and more complicated models. As Richard Liu, Quantitative Analytics manager at Wells Fargo said at IBM Think, “Academically, people talk about fancy algorithms. But in real life, how efficiently the models run in distributed environments is critical.” Wells Fargo uses the IBM AI Enterprise software platform for the speed and resource scheduling and management functionality it provides. “IBM is a very good partner and we are very pleased with their solution,” adds Liu.

Each part of the platform is designed to remove both time and pain from the process of developing a new applied artificial intelligence service.  By automating highly repetitive and manual steps, time is saved for improving and refining models, which can lead to a higher-quality result.

We’ve introduced a lot of new functionality in IBM PowerAI Enterprise 1.1, and I’ll be sharing more detail on these new capabilities in future posts.  I also welcome your input as we continue to add new capabilities moving forward.

The Sierra Supercomputer: Science and Technology on a Mission


More Information:

https://www.ibm.com/it-infrastructure/solutions/hpc

https://www.ibm.com/it-infrastructure/us-en/resources/power/big-data-hpc-hpda-analytics/

https://www.ibm.com/blogs/systems/reaching-the-summit-the-next-milestone-for-hpc/

https://www.technologyreview.com/s/611077/the-worlds-most-powerful-supercomputer-is-tailor-made-for-the-ai-era/

https://onlinexperiences.com/scripts/Server.nxp?LASCmd=AI:4;F:QS!10100&ShowKey=45975&LangLocaleID=1033&AffiliateData=Blog

https://www.ibm.com/it-infrastructure/us-en/resources/power/server-infrastructure-for-ai/












Capsule Neural Networks (CapsNets): a Better Alternative to Convolutional Neural Networks (CNNs)





Capsule Neural Networks (CapsNets): a Better Alternative


Geoffrey Hinton and his team published two papers that introduced a completely new type of neural network based on so-called capsules. In addition to that, the team published an algorithm, called dynamic routing between capsules, that allows such a network to be trained.

Introduction to Capsule Networks (CapsNets)


For everyone in the deep learning community, this is huge news, and for several reasons. First of all, Hinton is one of the founders of deep learning and an inventor of numerous models and algorithms that are widely used today. Secondly, these papers introduce something completely new, and this is very exciting because it will most likely stimulate an additional wave of research and very cool applications.


Capsule Neural Networks


What is a CapsNet or Capsule Network?

Introduction to How Faster R-CNN, Fast R-CNN and R-CNN Works


Faster R-CNN Architecture

How RPN (Region Proposal Networks) Works


What is a Capsule Network? What is a Capsule? Is CapsNet better than a Convolutional Neural Network (CNN)? In this article I will talk about all the above questions about CapsNet or Capsule Network released by Hinton.
Note: This article is not about pharmaceutical capsules. It is about capsules in the neural networks / machine learning world.
As a reader, you are expected to be aware of CNNs. If not, I would suggest going through this article on Hackernoon. Next, I will run through a small recap of the relevant points of CNNs, so that you can easily follow the comparison below. So without further ado, let's dive in.

CNNs are essentially systems where we stack a lot of neurons together. These networks have proven to be exceptionally good at handling image classification problems. It would be hard to have a neural network map out all the pixels of an image, since that is computationally really expensive. Convolution is a method which helps you simplify the computation to a great extent without losing the essence of the data: it is basically a lot of element-wise multiplication followed by summation of the results.
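To make "element-wise multiplication and summation" concrete, here is a bare-bones NumPy sketch of a single-channel 2D convolution (implemented, as most frameworks do, as cross-correlation); the 6x6 input and the vertical-edge kernel are made-up examples.

```python
# A bare-bones 2D convolution in NumPy: slide the kernel over the image, multiply
# element-wise, and sum each patch into a single output number.
import numpy as np

def conv2d(image: np.ndarray, kernel: np.ndarray) -> np.ndarray:
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            # Multiply the kernel with the image patch element-wise and sum the result.
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

image = np.random.rand(6, 6)
vertical_edge_kernel = np.array([[1.0, 0.0, -1.0],
                                 [1.0, 0.0, -1.0],
                                 [1.0, 0.0, -1.0]])
print(conv2d(image, vertical_edge_kernel).shape)   # (4, 4) feature map
```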

Capsule Networks Are Shaking up AI — An Introduction


After an image is fed to the network, a set of kernels or filters scan it and perform the convolution operation. This leads to the creation of feature maps inside the network. These features next pass through activation layers and pooling layers in succession, and depending on the number of layers in the network this continues. Activation layers are required to induce a sense of non-linearity in the network (e.g. ReLU). Pooling (e.g. max pooling) helps in reducing the training time. The idea of pooling is that it creates "summaries" of each sub-region. It also gives you a little bit of positional and translational invariance in object detection. At the end, the network passes the result through a classifier such as a softmax classifier, which gives us a class. Training happens based on back propagation of error matched against some labelled data. Non-linearity (such as ReLU) also helps mitigate the vanishing gradient problem in this step.
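That pipeline maps almost one-to-one onto a few lines of Keras. The sketch below is generic: the layer sizes, the 28x28 single-channel input and the 10-class softmax head are arbitrary choices, not tied to any particular dataset discussed here.

```python
# A minimal Keras model mirroring the pipeline described above:
# convolution -> ReLU activation -> max pooling, repeated, then a softmax classifier.
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Conv2D(32, kernel_size=3, activation="relu",
                           input_shape=(28, 28, 1)),    # kernels scan the image -> feature maps
    tf.keras.layers.MaxPooling2D(pool_size=2),           # pooling summarizes each sub-region
    tf.keras.layers.Conv2D(64, kernel_size=3, activation="relu"),
    tf.keras.layers.MaxPooling2D(pool_size=2),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dropout(0.5),                         # regularization during training only
    tf.keras.layers.Dense(10, activation="softmax"),      # class probabilities
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.summary()
```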

What is the problem with CNNs?

CNNs perform exceptionally well when they are classifying images which are very close to the training data set. If the images have rotation, tilt or any other different orientation, then CNNs perform poorly. This problem was addressed by adding different variations of the same image during training. In a CNN, each layer understands an image at a progressively more complex level. Let's understand this with an example of classifying ships and horses. The innermost, or first, layer understands small curves and edges. The second layer might understand straight lines or smaller shapes, like the mast of a ship or the curvature of an entire tail. Higher-up layers start understanding more complex shapes like the entire tail or the ship's hull. Final layers try to see a more holistic picture, like the entire ship or the entire horse. We use pooling after each layer to keep computation within reasonable time frames, but in essence pooling also loses positional data.

Pooling helps in creating positional invariance; otherwise CNNs would fit only images or data which are very close to the training set. But this invariance also leads to false positives for images which contain the components of a ship, just not in the correct order: the system can match a jumbled arrangement of ship parts to a real ship, even though you as an observer clearly see the difference. The pooling layer adds this sort of invariance.

Depthwise Separable Convolution - A FASTER CONVOLUTION!


This was never the intention of the pooling layer. What pooling was supposed to do is introduce positional, orientational and proportional invariances, but the method we use to get this is very crude. In reality it adds all sorts of positional invariance, which leads to the dilemma above of a jumbled set of ship parts being detected as a correct ship. What we needed was not invariance but equivariance. Invariance makes a CNN tolerant to small changes in viewpoint. Equivariance makes a CNN understand a rotation or proportion change and adapt itself accordingly, so that the spatial positioning inside an image is not lost. A ship will still be a smaller ship, but the network will adapt its representation to detect it at that size. This leads us to the recent advancement of Capsule Networks.


Hinton himself stated that the fact that max pooling is working so well is a big mistake and a disaster:

Hinton: “The pooling operation used in convolutional neural networks is a big mistake and the fact that it works so well is a disaster.”

Of course, you can do away with max pooling and still get good results with traditional CNNs, but they still do not solve the key problem:

Internal data representation of a convolutional neural network does not take into account important spatial hierarchies between simple and complex objects.

In the example of a face, the mere presence of two eyes, a mouth and a nose in a picture does not mean there is a face; we also need to know how these objects are oriented relative to each other.

What is a Capsule Network?

Every few days there is an advancement in the field of Neural Networks. Some brilliant minds are working on this field, and you can pretty much assume every paper on this topic is ground breaking or path changing. Sara Sabour, Nicholas Frosst and Geoffrey Hinton released a paper titled "Dynamic Routing Between Capsules" 4 days back, and when one of the godfathers of Deep Learning releases a paper, it is bound to be ground breaking. The entire Deep Learning community is going crazy about this paper as you read this article. The paper talks about capsules, CapsNet and a run on MNIST, a database of tagged handwritten digit images. Results show a significant increase in performance in the case of overlapped digits compared with the current state-of-the-art CNNs. In this paper the authors propose that the human brain has modules called "capsules". These capsules are particularly good at handling different types of visual stimulus and encoding things like pose (position, size, orientation), deformation, velocity, albedo, hue, texture etc. The brain must have a mechanism for "routing" low-level visual information to what it believes is the best capsule for handling it.

Capsule Networks



A capsule is a nested set of neural layers. In a regular neural network you keep adding more layers; in CapsNet you add more layers inside a single layer, or in other words nest a neural layer inside another. The state of the neurons inside a capsule captures the properties described above of one entity inside an image. A capsule outputs a vector to represent the existence of the entity, and the orientation of the vector represents the properties of the entity. The vector is sent to all possible parents in the neural network. For each possible parent, a capsule can compute a prediction vector, calculated by multiplying its own output by a weight matrix. Whichever parent has the largest scalar product of prediction and output has its bond with the capsule increased, while the remaining parents have their bonds decreased. This routing-by-agreement method is superior to the current mechanism of max pooling, which routes based only on the strongest feature detected in the lower layer. Apart from dynamic routing, CapsNet adds squashing to each capsule. Squashing is a non-linearity: instead of adding it to each layer as you do in a CNN, you apply it to a nested set of layers, so the squashing function gets applied to the vector output of each capsule.
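Below is a NumPy sketch of this routing-by-agreement loop, following my reading of the description above and of the Sabour, Frosst and Hinton paper; the capsule counts, dimensions and three routing iterations are illustrative defaults, not the reference implementation.

```python
# NumPy sketch of "routing by agreement" between one layer of capsules and the next.
# Shapes and iteration counts are illustrative only.
import numpy as np

def squash(s, axis=-1, eps=1e-8):
    """Shrink short vectors towards 0 and limit long vectors to length just below 1."""
    sq_norm = np.sum(s ** 2, axis=axis, keepdims=True)
    return (sq_norm / (1.0 + sq_norm)) * s / np.sqrt(sq_norm + eps)

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def route(u_hat, iterations=3):
    """u_hat: prediction vectors, shape (num_lower, num_upper, dim_upper)."""
    num_lower, num_upper, _ = u_hat.shape
    b = np.zeros((num_lower, num_upper))               # routing logits start neutral
    for _ in range(iterations):
        c = softmax(b, axis=1)                         # coupling coefficients per lower capsule
        s = np.einsum("ij,ijk->jk", c, u_hat)          # weighted sum of predictions
        v = squash(s)                                  # output vectors of upper capsules
        b += np.einsum("ijk,jk->ij", u_hat, v)         # agreement increases the bond
    return v

u_hat = np.random.randn(1152, 10, 16)                  # e.g. PrimaryCaps -> 10 DigitCaps
print(route(u_hat).shape)                              # (10, 16)
```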

Why Deep Learning Works: Self Regularization in Deep Neural Networks



The paper introduces a new squashing function (the squash helper in the sketch above follows the same formula). ReLU or similar non-linearity functions work well with single neurons, but the paper found that this squashing function works best with capsules. It squashes the length of the output vector of a capsule: short vectors are pushed towards 0, while long vectors are limited to a length just below 1. The dynamic routing adds some extra computation cost, but it definitely gives an added advantage.

Now we need to realise that this paper is almost brand new and the concept of capsules has not been thoroughly tested. It works on MNIST data but it still needs to be proven against much larger datasets across a variety of classes. There are already (within 4 days) follow-ups to this paper which raise the following concerns:
1. It uses the length of the pose vector to represent the probability that the entity represented by a capsule is present. To keep the length less than 1 requires an unprincipled non-linearity that prevents there from being any sensible objective function that is minimized by the iterative routing procedure.
2. It uses the cosine of the angle between two pose vectors to measure their agreement for routing. Unlike the log variance of a Gaussian cluster, the cosine is not good at distinguishing between quite good agreement and very good agreement.
3. It uses a vector of length n rather than a matrix with n elements to represent a pose, so its transformation matrices have n² parameters rather than just n.

The current implementation of capsules has scope for improvement. But we should also keep in mind that the Hinton paper in the first place only says:

The aim of this paper is not to explore this whole space but to simply show that one fairly straightforward implementation works well and that dynamic routing helps.


Capsule Neural Networks: The Next Neural Networks?  CNNs and their problems.
Convolutional (‘regular’) Neural Networks are the latest hype in machine learning, but they have their flaws. Capsule Neural Networks are the recent development from Hinton which help us solve some of these issues.

Neural Networks may be the hottest field in Machine Learning. In recent years, there were many new developments improving neural networks and making them more accessible. However, they were mostly incremental, such as adding more layers or slightly improving the activation function, and did not introduce a new type of architecture or topic.

Geoffrey Hinton is one of the founding fathers of many highly utilized deep learning algorithms, including many developments to Neural Networks — no wonder, given his background in neuroscience and artificial intelligence.

Capsule Networks: An Improvement to Convolutional Networks


In late October 2017, Geoffrey Hinton, Sara Sabour, and Nicholas Frosst published a research paper at Google Brain named "Dynamic Routing Between Capsules", introducing a true innovation to Neural Networks. This is exciting, since such a development has been long awaited, will likely spur much more research and progress around it, and is supposed to make neural networks even better than they are now.

Capsule networks: overview

The Baseline: Convolutional Neural Networks

Convolutional Neural Networks (CNNs) are extremely flexible machine learning models which were originally inspired by principles from how our brains are theorized to work.
Neural Networks utilize layers of “neurons” to process raw data into patterns and objects.
The primary building block of a Convolutional Neural Network is the "convolutional" layer (hence the name). What does it do? It takes raw information from a previous layer, makes sense of patterns in it, and sends it onward to the next layer to make sense of a larger picture.

 If you are new to neural networks and want to understand it, I recommend:

  • Watching the animated videos by 3Blue1Brown.
  • For a more detailed textual/visual guide, you can check out this beginner’s blogpost
  • If you can deal with some more math and greater details, you can read instead this guide from CS231 at Stanford. 


 In case you didn’t do any of the above, and plan to continue, here is a hand-wavy brief overview.

The Intuition Behind Convolutional Neural Networks

Let’s start from the beginning.
The Neural Net receives raw input data. Let's say it's a doodle of a dog. When you see a dog, your brain automatically detects that it's a dog. But to the computer, the image is really just an array of numbers representing the color intensity in the color channels. Let's say it's just a black-and-white doodle, so we can represent it with one array where each cell represents the brightness of the pixel from black to white.

Understanding Convolutional Neural Networks.

Convolutional Layers. The first convolutional layer maps the image space to a lower space, summarizing what's happening in each group of, say, 5x5 pixels: is it a vertical line? A horizontal line? A curve, and of what shape? This happens with element-wise multiplication of the filter values with the original pixel values, followed by summation into a single number.

This leads to the Neuron, or convolutional filter. Each filter/neuron is designed to react to one specific form (a vertical line? a horizontal line? etc.). The groups of pixels from layer 1 reach these neurons and light up the ones whose structure they match, according to how similar each slice is to what the neuron looks for.

Activation (usually "ReLU") Layers — After each convolutional layer, we apply a nonlinear layer (or activation layer), which introduces non-linearity to the system, enabling it to also discover nonlinear relations in the data. ReLU is a very simple one: it maps any negative input to 0 and keeps a positive input unchanged: ReLU(x) = max(0, x).

Pooling Layers. These allow us to reduce "unnecessary" information, summarize what we know about a region, and continue to refine information. For example, this might be "MaxPooling", where the computer just takes the highest value passed from that patch — so the computer knows "around these 5x5 pixels, the most dominant value is 255; I don't know exactly in which pixel, but the exact location isn't as important as the fact that it's around there." → Notice: this is not good. We lose information here. Capsule Networks don't have this operation, which is an improvement.

Dropout Layers. This layer "drops out" a random set of activations in that layer by setting them to zero. This makes the network more robust (kind of like how eating dirt builds up your immune system: the network becomes more immune to small changes) and reduces overfitting. It is only used when training the network.

Last Fully Connected Layer. For a classification problem, we want each final neuron to represent a final class. It looks at the output of the previous layer (which, as we remember, should represent the activation maps of high-level features) and determines which features most correlate to a particular class.

SoftMax — This layer is sometimes added as another way to represent the outputs per class, which we can later pass on to a loss function. Softmax represents the distribution of probabilities over the various categories.
Usually, there are more layers which provide nonlinearities and preservation of dimensions (like padding with 0’s around the edges) that help to improve the robustness of the network and control overfitting. But these are the basics you need to understand what comes after.

Capsule Network




 Now, importantly, these layers are connected only SEQUENTIALLY. This is in contrast to the structure of capsule networks.


What is The Problem With Convolutional Neural Networks?

If this interests you, watch Hinton's lecture explaining exactly what is wrong with them. Below you'll get a couple of key points that are improved by Capsule Networks.

Hinton says that they have too few levels of substructure (nets are composed of layers composed of neurons, and that's it), and that we need to group the neurons in each layer into "capsules", like mini-columns, that do a lot of internal computations and then output a summary result.

Problems with CNNs and Introduction to capsule neural networks

Problem #1: Pooling loses information

CNNs use "pooling" or equivalent methods to "summarize" what's going on in smaller regions and make sense of larger and larger chunks of the image. This was a solution that made CNNs work well, but it loses valuable information.

Capsule networks will instead compute a pose (translational and rotational) relationship between smaller features to make up a larger feature.
Pooling's loss of information leads to a loss of spatial information.

Problem #2: CNNs don't account for the spatial relations between the parts of the image. Therefore, they also are too sensitive to orientation.

Subsampling (and pooling) loses the precise spatial relationships between higher-level parts like a nose and a mouth. The precise spatial relationships are needed for identity recognition.

(Hinton, 2012, in his lecture).

Geoffrey Hinton Capsule theory



CNNs don't account for spatial relationships between the underlying objects. By having these flat layers of neurons that light up according to which objects they've seen, they recognize the presence of such objects. But the activations are then passed on to further activation and pooling layers and on to the next layer of neurons (filters), without recognizing the relations between the objects identified in that single layer.
They just account for their presence.

Hinton: Dynamic Routing Between Capsules


So a (simplistic) neural network will not hesitate to categorize both these dogs, Pablo and Picasso, as similarly good representations of a "corgi-pit-bull-terrier mix".

Capsule Networks (CapsNets) – Tutorial

Problem #3: CNNs can't transfer their understanding of geometric relationships to new viewpoints.

This makes them more sensitive to the original image itself in order to classify images as the same category.

 CNNs are great for solving problems with data similar to what they have been trained on. It can classify images or objects within them which are very close to things it has seen before.

 But if the object is slightly rotated, photographed from a slightly different angle, especially in 3D, is tilted or in another orientation than what the CNN has seen - the network won't recognize it well.

One solution is to artificially create tilted representations of the images and add them to the "training" set. However, this still lacks a fundamentally more robust structure.


What is a Capsule?

Capsule Networks Explained in detail ! (Deep learning)



In order to answer this question, I think it is a good idea to refer to the first paper where capsules were introduced — "Transforming Autoencoders" by Hinton et al. The part that is important to the understanding of capsules is provided below:

“Instead of aiming for viewpoint invariance in the activities of “neurons” that use a single scalar output to summarize the activities of a local pool of replicated feature detectors, artificial neural networks should use local “capsules” that perform some quite complicated internal computations on their inputs and then encapsulate the results of these computations into a small vector of highly informative outputs. Each capsule learns to recognize an implicitly defined visual entity over a limited domain of viewing conditions and deformations and it outputs both the probability that the entity is present within its limited domain and a set of “instantiation parameters” that may include the precise pose, lighting and deformation of the visual entity relative to an implicitly defined canonical version of that entity. When the capsule is working properly, the probability of the visual entity being present is locally invariant — it does not change as the entity moves over the manifold of possible appearances within the limited domain covered by the capsule. The instantiation parameters, however, are “equivariant” — as the viewing conditions change and the entity moves over the appearance manifold, the instantiation parameters change by a corresponding amount because they are representing the intrinsic coordinates of the entity on the appearance manifold.”

The paragraph above is very dense, and it took me a while to figure out what it means, sentence by sentence. Below is my version of the above paragraph, as I understand it:

Artificial neurons output a single scalar. In addition, CNNs use convolutional layers that, for each kernel, replicate that same kernel's weights across the entire input volume and then output a 2D matrix, where each number is the output of that kernel's convolution with a portion of the input volume. So we can look at that 2D matrix as the output of a replicated feature detector. Then all the kernels' 2D matrices are stacked on top of each other to produce the output of a convolutional layer.

Then, we try to achieve viewpoint invariance in the activities of neurons. We do this by means of max pooling, which consecutively looks at regions in the 2D matrix described above and selects the largest number in each region. As a result, we get what we wanted — invariance of activities. Invariance means that by changing the input a little, the output still stays the same, and activity is just the output signal of a neuron. In other words, when we shift the object that we want to detect by a little bit in the input image, the network's activities (the outputs of neurons) will not change because of max pooling, and the network will still detect the object.
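A toy NumPy illustration of this invariance, under my own simplifying assumptions of a single feature map and 2x2 max pooling:

```python
# Shift the input by one pixel and the pooled output can stay exactly the same,
# so the network "doesn't notice" the shift.
import numpy as np

def max_pool_2x2(x):
    h, w = x.shape
    return x.reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

a = np.zeros((4, 4))
a[0, 0] = 1.0                 # a "feature" in the top-left corner
b = np.roll(a, 1, axis=1)     # same feature shifted one pixel to the right

print(max_pool_2x2(a))
print(max_pool_2x2(b))        # identical pooled map: the shift is invisible after pooling
```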

Dynamic routing between capsules


The mechanism described above is not very good, because max pooling loses valuable information and also does not encode relative spatial relationships between features. We should use capsules instead, because they encapsulate all the important information about the state of the features they are detecting in the form of a vector (as opposed to the scalar that a neuron outputs).

[PR12] Capsule Networks - Jaejun Yoo



Capsules encapsulate all important information about the state of the feature they are detecting in vector form.

Capsules encode the probability of detection of a feature as the length of their output vector. The state of the detected feature is encoded as the direction in which that vector points (its "instantiation parameters"). So when the detected feature moves around the image or its state somehow changes, the probability stays the same (the length of the vector does not change), but its orientation changes.

t.1 – Capsules and routing techniques (part 1/2)



Imagine that a capsule detects a face in the image and outputs a 3D vector of length 0.99. Then we start moving the face across the image. The vector will rotate in its space, representing the changing state of the detected face, but its length will remain fixed, because the capsule is still sure it has detected a face. This is what Hinton refers to as activities equivariance: neuronal activities will change when an object “moves over the manifold of possible appearances” in the picture. At the same time, the probabilities of detection remain constant, which is the form of invariance that we should aim at, and not the type offered by CNNs with max pooling.

PR-012: Faster R-CNN : Towards Real-Time Object Detection with Region Proposal Networks


More Information:

“Understanding Dynamic Routing between Capsules (Capsule Networks)”
https://jhui.github.io/2017/11/03/Dynamic-Routing-Between-Capsules/


Understanding Hinton’s Capsule Networks. Part I: Intuition.   https://medium.com/ai³-theory-practice-business/understanding-hintons-capsule-networks-part-i-intuition-b4b559d1159b



Understanding Capsule Networks — AI’s Alluring New Architecture.  https://medium.freecodecamp.org/understanding-capsule-networks-ais-alluring-new-architecture-bdb228173ddc



What is a CapsNet or Capsule Network?   https://hackernoon.com/what-is-a-capsnet-or-capsule-network-2bfbe48769cc


https://en.wikipedia.org/wiki/Capsule_neural_network


https://en.wikipedia.org/wiki/Convolutional_neural_network


A “weird” introduction to Deep Learning.  https://towardsdatascience.com/a-weird-introduction-to-deep-learning-7828803693b0


Faster R-CNN Explained   https://medium.com/@smallfishbigsea/faster-r-cnn-explained-864d4fb7e3f8


A Brief History of CNNs in Image Segmentation: From R-CNN to Mask R-CNN   https://blog.athelas.com/a-brief-history-of-cnns-in-image-segmentation-from-r-cnn-to-mask-r-cnn-34ea83205de4


Convolutional Neural Network (CNN)   https://skymind.ai/wiki/convolutional-network


Capsule Neural Networks: The Next Neural Networks? Part 1: CNNs and their problems  https://towardsdatascience.com/capsule-neural-networks-are-here-to-finally-recognize-spatial-relationships-693b7c99b12


Understanding Hinton’s Capsule Networks. Part II: How Capsules Work    https://medium.com/ai³-theory-practice-business/understanding-hintons-capsule-networks-part-ii-how-capsules-work-153b6ade9f66


Understanding Hinton’s Capsule Networks. Part III: Dynamic Routing Between Capsules   https://medium.com/ai³-theory-practice-business/understanding-hintons-capsule-networks-part-iii-dynamic-routing-between-capsules-349f6d30418


Understanding Hinton’s Capsule Networks. Part IV: CapsNet Architecture   https://medium.com/@pechyonkin/part-iv-capsnet-architecture-6a64422f7dce



Oracle Database 18.3.0 on premises is Released in July





Oracle Database 18.3.0 on premises


Today we have a guest blogger, Dominic Giles, Master Product Manager for Oracle Database, providing us with insights into what to expect from Oracle Database 18c.

Oracle Database 18c


Oracle Database 18c's arrival marks a change in the way the world's most popular database is released. It brings new functionality and improvements to features already available in Oracle Database 12c. In this blog, I'll highlight what you can expect from this new release and where you can get additional information, but first let me address the new release model that the Database team has adopted.

Release schedule




Oracle Database 18c is the first version of the product to follow a yearly release pattern. From here onwards the Oracle Database will be released every year, along with quarterly updates. You can find more details on this change by visiting Oracle Support and taking a look at support Document 2285040.1, or on Mike Dietrich's blog. If you're confused as to why we've apparently skipped 6 releases of Oracle, it may be simpler to regard "Oracle Database 18c" as "Oracle Database 12c Release 2 12.2.0.2", where we've simply changed the naming to reflect the year in which the product is released.

Into the Future with Oracle Autonomous Database


We believe the move to a yearly release model and the simplification of the patching process will result in a product that introduces new smaller changes more frequently without the potential issues that a monolithic update brings.

New Release and Patching Model for Oracle Database

 

Building on a strong foundation

Oracle Database 18c, as I mentioned earlier, is the next iteration of Oracle Database 12c Release 2 and as a result, it has a lot of incremental enhancements aimed to improve upon this important release. With that in mind, let’s remind ourselves what was in Oracle Database 12c Release 2.

Oracle Autonomous Data Warehouse Cloud Demo



The release itself focused on 3 major areas:

Multitenant is Oracle's strategic container architecture for the Oracle Database. It introduced the concept of a pluggable database (PDB), enabling users to plug and unplug their databases and move them to other containers, either locally or in the cloud. The architecture enables massive consolidation and the ability to manage/patch/backup many databases as one. We introduced this architecture in Oracle Database 12c and extended its capabilities in Oracle Database 12c Release 2 with the ability to hot clone, online relocate, and provide resource controls for IO, CPU and Memory on a per-PDB basis. We also ensured that all of the features available in a non-container database are available for a PDB (Flashback Database, Continuous Query etc.).

Database In-Memory enables users to perform lightning fast analytics against their operational databases without being forced to acquire new hardware or make compromises in the way they process their data. The Oracle Database enables users to do this by adopting a dual in-memory model where OLTP data is held both as rows, enabling it to be efficiently updated, and in a columnar form, enabling it to be scanned and aggregated much faster. This columnar in-memory format then leverages compression and software in silicon to analyze billions of rows a second, meaning reports that used to take hours can now be executed in seconds. In Oracle Database 12c Release 2 we introduced many new performance enhancements and extended this capability with new features that enabled us to perform In-Memory analytics on JSON documents, as well as significantly improving the speed at which the In-Memory column store is available to run queries after startup.

Oracle Database Sharding, released in Oracle Database 12c Release 2, provides OLTP scalability and fault isolation for users that want to scale outside of the usual confines of a typical SMP server. It also supports use cases where data needs to be placed in a specific geographic location for performance or regulatory reasons. Oracle Sharding provides superior run-time performance and simpler life-cycle management compared to home-grown deployments that use a similar approach to scalability. Users can automatically scale up the shards to reflect increases in workload, making Oracle one of the most capable and flexible approaches to web-scale workloads for the enterprise today.

Oracle Database 18c (autonomous database)


Oracle 12c Release 2 also included over 600 new features, ranging from syntax improvements to features like improved Index Compression, Real-Time Materialized Views, Index Usage Statistics, improved JSON support, enhancements to Real Application Clusters and many more. I'd strongly recommend taking a look at the "New Features Guide for Oracle Database 12c Release 2" in the Oracle documentation.

Oracle Autonomous Data Warehouse – How It Works



Incremental improvements across the board

As you'd expect from a yearly release, Oracle Database 18c doesn't contain any seismic changes in functionality, but there are lots of incremental improvements. These range from syntax enhancements to improvements in performance; some will require that you explicitly enable them, whilst others will happen out of the box. Whilst I'm not going to be able to cover all of the many enhancements in detail, I'll do my best to give you a flavor of some of these changes. To do this I'll break the improvements into 6 main areas: Performance, High Availability, Multitenant, Security, Data Warehousing and Development.

Oracle database in cloud, dr in cloud and overview of oracle database 18c

Performance

For users of Exadata and Real Application Clusters (RAC), Oracle Database 18c brings changes that will enable a significant reduction in the amount of undo that needs to be transferred across the interconnect. It achieves this by using RDMA, over the InfiniBand connection, to access the undo blocks in the remote instance. This feature, combined with a local commit cache, significantly improves the throughput of some OLTP workloads when running on top of RAC. Combined with all of the performance optimization that Exadata brings to the table, this cements its position as the highest-performance Database Engineered System for both OLTP and Data Warehouse workloads.

To support applications that fetch data primarily via a single unique key, Oracle Database 18c provides a memory-optimized lookup capability. Users simply need to allocate a portion of Oracle's memory (SGA) and identify which tables they want to benefit from this functionality; the database takes care of the rest. SQL fetches are significantly faster as they bypass the SQL layer and utilize an in-memory hash index to reduce the number of operations that need to be performed to get the row. For some classes of application this functionality can result in upwards of a 4x increase in throughput with a halving of response times.

Oracle PL/SQL 12c and 18c New Features + RADstack + Community Sites


To ease the maintenance work for In-Memory it’s also now possible to have tables and partitions automatically populated into and aged out of the column store. It does this by utilizing the Heat Map such that when the Column Store is under memory pressure it evicts inactive segments if more frequently accessed segments would benefit from population.

Oracle Database In-Memory gets a number of improvements as well. It now uses parallel lightweight threads to scan its compression units rather than process-driven serial scans. This is available for both serial and parallel scans of data, and it can double the speed at which data is read. This improves the already exceptional scan performance of Oracle Database In-Memory. Alongside this feature, Oracle Database In-Memory also enables Oracle Number types to be held in their native binary representation (int, float etc.). This enables the data to be processed by the vector processing units on processors like Intel's Xeon CPUs much faster than previously. For some aggregation and arithmetic operations this can result in a staggering 40 times improvement in performance.

Finally, In-Memory in Oracle Database 18c also allows you to place data from external tables in the column store, enabling you to execute high performance analytics on data outside of the database.

High Availability

Whether you are using Oracle Real Application Clusters or Oracle Data Guard, we continue to look for ways to improve the Oracle Database's high availability functionality. With Oracle Database 18c we're rolling out a few significant upgrades.

Oracle Real Application Clusters also gets a hybrid sharding model. With this technology you can enjoy all of the benefits that a shared-disk architecture provides whilst leveraging some of the benefits that Sharding offers. The Oracle Database will affinitize table partitions/shards to nodes in the cluster and route connections using the Oracle Database Sharding API based on a shard key. The benefit of this approach is that it formalizes a technique often taken by application developers to improve buffer cache utilization and reduce the number of cross-shard pings between instances. It also has the advantage of removing the punitive cost of cross-shard queries simply by leveraging RAC's shared-disk architecture.

Sharding also gets some improvements in Oracle Database 18c in the form of "User Defined Sharding" and "Swim Lanes". Users can now specify how shards are to be defined using either the system-managed approach, "Hashing", or by using an explicit user-defined model of "Range" and "List" sharding. Using either of these last two approaches gives users the ability to ensure that data is placed in a location appropriate for its access. This might be to reduce the latency between the application and the database, or to simply ensure that data is placed in a specific data center to conform to geographical or regulatory requirements. Sharded swim lanes also make it possible to route requests through sharded application servers all the way to a sharded Oracle Database. Users do this by having their routing layer call a simple REST API. The real benefit of this approach is that it can improve throughput and reduce latency whilst minimizing the number of possible connections the Oracle Database needs to manage.

For the users of Java in the Database we’re rolling out a welcome fix that will make it possible to perform rolling patching of the database.

Multitenant

Multitenant in Oracle Database 18c got a number of updates to continue to round out the overall architecture.  We’re introducing the concept of a Snapshot Carousel. This enables you to define regular snapshots of PDBs. You can then use these snapshots as a source for PDB clones from various points of time, rather than simply the most current one. The Snapshot Carousel might be ideal for a development environment or to augment a non-mission critical backup and recovery process.

I’m regularly asked if we support Multitenant container to container active/active Data Guard Standbys. This is where some of the primary PDBs in one container have standby PDBs in an opposing container and vice versa. We continue to move in that direction and in Oracle Database 18c we move a step closer with the introduction of “Refreshable PDB Switchover”. This enables users to create a PDB which is an incrementally updated copy of a “master” PDB. Users may then perform a planned switchover between the PDBs inside of the container. When this happens the master PDB becomes the clone and the old clone the master. It’s important to point out that this feature is not using Data Guard; rather it extends the incremental cloning functionality we introduced in Oracle Database 12c Release 2.

In Oracle Database 18c Multitenant also got some Data Guard improvements. You can now automatically maintain standby databases when you clone a PDB on the primary. This operation will ensure that the PDB, including all of its data files, is created on the standby database. This significantly simplifies the process needed to provide disaster recovery for PDBs when running inside of a container database. We have also made it possible to clone a PDB from an Active Data Guard standby. This feature dramatically simplifies the work needed to provide copies of production databases for development environments.

Multitenant also got a number of small improvements that are still worth mentioning. We now support the use of backups performed on a PDB prior to it being unplugged and plugged into a new container. You can also expect upgrades to be quicker under Multitenant in Oracle Database 18c.

Security

The Oracle Database is widely regarded as the most secure database in the industry and we continue to innovate in this space. In Oracle Database 18c we have added a number of small but important updates. A simple change that could have a big impact for the security of some databases is the introduction of schema-only accounts. This functionality allows schemas to act as the owners of objects but does not allow clients to log in, potentially reducing the attack surface of the database.
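A sketch of what this might look like from Python with the cx_Oracle driver is below; the CREATE USER ... NO AUTHENTICATION clause reflects my reading of the 18c syntax, and the connection string, account name and grants are placeholders to be checked against your own environment.

```python
# Sketch of creating an 18c schema-only account from Python with cx_Oracle.
import cx_Oracle

conn = cx_Oracle.connect("admin", "admin_password", "dbhost/pdb1")
cur = conn.cursor()

# A schema that can own objects but that no client can log in to directly.
cur.execute("CREATE USER app_owner NO AUTHENTICATION")
cur.execute("GRANT CREATE TABLE, CREATE VIEW, CREATE PROCEDURE TO app_owner")
cur.execute("ALTER USER app_owner QUOTA UNLIMITED ON users")

# Application objects now live in a schema with no password to steal or guess,
# shrinking the attack surface described above.
conn.commit()
```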

Database Security 18c New Features


To improve the isolation of Pluggable Databases (PDBs) we are adding the ability for each PDB to have its own keystore rather than having one for the entire container. This also simplifies the configuration of non-container databases by introducing explicit parameters and hence removing the requirement to edit the sqlnet.ora file.

DB Security; Secure your Data


A welcome change for some Microsoft users is the integration of the Oracle Database with Active Directory. Oracle Database 18c allows Active Directory to authenticate and authorize users directly without the need to also use Oracle Internet Directory. In the future we hope to extend this functionality to include other third-party LDAP version 3–compliant directory services. This change significantly reduces the complexity needed to perform this task and as a result improves the overall security and availability of this critical component.

Data Warehousing

Oracle Database 18c’s support for data warehousing got a number of welcome improvements.

Whilst machine learning has gotten a lot of attention in the press and social media recently it’s important to remind ourselves that the Oracle Database has had a number of these algorithms since Oracle 9i.  So, in this release we’ve improved upon our existing capability by implementing some of them directly inside of the database without the need for callouts, as well as added some more.



One of the compromises that data warehouse users have had to accept in the past was that if they wanted to use a standby database, they couldn't use no-logging to rapidly load data into their tables. In Oracle Database 18c that no longer has to be the case. Users can make a choice between two modes whilst accommodating the loading of non-logged data. The first ensures that standbys receive non-logged data changes with minimum impact on loading speed at the primary, but at the cost of allowing the standby to have transient non-logged blocks; these non-logged blocks are automatically resolved by managed standby recovery. The second ensures all standbys have the data when the primary load commits, but at the cost of throttling the speed of loading data at the primary, which means the standbys never have any non-logged blocks.

Using Oracle Database as a Document Store


One of the most interesting developments in Oracle Database 18c is the introduction of polymorphic table functions. Table functions are a popular feature that enables a developer to encapsulate potentially complicated data transformations, aggregations, security rules etc. inside of a function that, when selected from, returns the data as if it were coming from a physical table. For very complicated ETL operations these table functions can be pipelined and even executed in parallel. The only downside of this approach was that you had to declare the shape of the data returned as part of the definition of the function, i.e. the columns to be returned. With polymorphic table functions, the shape of the data to be returned is determined by the parameters passed to the table function. This gives polymorphic table functions the ability to be more generic in nature, at the cost of a little more code.



One of my personal favorite features of this release is the ability to merge partitions online. This is particularly useful if you partition your data by some unit of time, e.g. minutes, hours, days or weeks, and at some stage, as the data is updated less frequently, you aggregate some of the partitions into larger partitions to simplify administration. This was possible in previous versions of the database, but the table was inaccessible whilst the merge took place. In Oracle Database 18c you can merge your partitions online and maintain the indexes as well. This rounds out a whole list of online table and partition operations introduced in Oracle Database 12c Release 1 and Release 2, e.g. move table online, split partition online, convert table to partition online, etc.
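
A minimal sketch, assuming a hypothetical sales table partitioned by month:

-- Merge two monthly partitions without blocking queries or DML,
-- keeping the indexes usable throughout
ALTER TABLE sales
  MERGE PARTITIONS sales_2018_01, sales_2018_02 INTO PARTITION sales_2018_q1
  UPDATE INDEXES ONLINE;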

Oracle Database MAD: 2.1 Data Management Trends: SQL, NoSQL y Big Data


For some classes of queries, getting a reasonably accurate approximate answer quickly is more useful than getting an exact answer slowly. In Oracle Database 12c we introduced the APPROX_COUNT_DISTINCT function, whose results are typically 97% accurate or better but can be returned orders of magnitude faster. We added further functions in Oracle Database 12c Release 2, and in 18c we provide additional approximate aggregation operations: APPROX_COUNT(), APPROX_SUM() and APPROX_RANK().
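
For example, a quick cardinality estimate (table and column names are illustrative):

-- Trade a little accuracy for a much faster distinct count
SELECT APPROX_COUNT_DISTINCT(cust_id) AS approx_customers
FROM   sales;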

Oracle Spatial and Graph also received some improvements in this release. We added support for graphs in Oracle Database 12c Release 2, and now in Oracle Database 18c you can use the Property Graph Query Language (PGQL) to simplify querying the data held within them. Performance was also boosted with the introduction of support for Oracle Database In-Memory and list-hash partitioning.

We also added a little syntactic sugar for external tables. You can now specify the external table definition inline in a SQL statement, so there is no need to create definitions that are used once and then dropped.
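
A sketch of an inline external table, assuming a data_dir directory object and a sales.csv file already exist:

SELECT region, amount
FROM   EXTERNAL (
         (region VARCHAR2(30), amount NUMBER)
         TYPE ORACLE_LOADER
         DEFAULT DIRECTORY data_dir
         ACCESS PARAMETERS (FIELDS TERMINATED BY ',')
         LOCATION ('sales.csv')
         REJECT LIMIT UNLIMITED
       ) sales_ext;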

Development

As you’d expect, there are a number of Oracle Database 18c improvements for developers, and we are also updating our tools and APIs.

JSON is rapidly becoming the preferred format for application developers to transfer data between application tiers. In Oracle Database 12c we introduced support that enabled JSON to be persisted in the Oracle Database and queried using dot notation. This gave developers a no-compromise platform for JSON persistence combined with the power and industry-leading analytics of the Oracle Database. Developers could also treat the Oracle Database as if it were a NoSQL database using the Simple Oracle Document Access (SODA) API. This meant that whilst some developers could build applications using the REST or Java NoSQL APIs, others could build analytical reports over the same data using SQL. In Oracle Database 18c we’ve added a new SODA API for C and PL/SQL and included a number of improvements to the functions that return or manipulate JSON in the database via SQL. We’ve also enhanced the support for JSON with Oracle Sharding.
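
As a simple illustration of querying persisted JSON (assuming a hypothetical orders table whose doc column carries an IS JSON constraint):

-- Dot notation over a JSON column; the table alias is required
SELECT o.doc.customer.name AS customer_name,
       o.doc.total         AS order_total
FROM   orders o
WHERE  JSON_VALUE(o.doc, '$.status') = 'SHIPPED';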

Developer day v2


Global temporary tables are an excellent way to hold transient data used in reporting or batch jobs within the Oracle Database. However, their shape, determined by their columns, is persisted and visible across all sessions in the database. In Oracle Database 18c we provide a more flexible approach with private temporary tables. These allow users to define a table whose definition is visible only within a given session, or even just a single transaction. This approach gives developers more flexibility in the way they write code and can ultimately lead to better code maintenance.
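
A minimal sketch; note that private temporary table names must use the configured prefix, which defaults to ORA$PTT_:

-- Both the definition and the data vanish at commit
CREATE PRIVATE TEMPORARY TABLE ora$ptt_month_end (
  region VARCHAR2(30),
  amount NUMBER
) ON COMMIT DROP DEFINITION;

-- Use ON COMMIT PRESERVE DEFINITION instead to keep the table for the whole session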



Oracle Application Express, Oracle SQL Developer, Oracle SQLcl and ORDS have all been tested with 18c and in some instances get small bumps in functionality, such as support for sharding.

We also plan to release a REST API for the Oracle Database. This will ship with ORDS 18.1 a little later this year.

And One Other Thing…

We’re also introducing a new mode for Connection Manager. If you’re not familiar with what Connection Manager (CMAN) does today, I’d recommend taking a look here. Basically, CMAN acts as a connection concentrator, enabling you to funnel thousands of sessions into a single Oracle Database. With the new mode introduced in Oracle Database 18c, it’s able to do a lot more. It can now automatically route connections to surviving database resources in the event of an outage. It can also redirect connections transparently if you relocate a PDB. It can load-balance connections across databases and PDBs whilst also transparently enabling connection performance enhancements such as statement caching and pre-fetching. And it can now significantly improve the security of incoming connections to the database.

Oracle OpenWorld Major Announcements


All in all, an exciting improvement to a great networking resource for the Oracle Database.


Oracle Database Release 18c New Features Complete List:  

https://docs.oracle.com/en/database/oracle/oracle-database/18/newft/new-features.html#GUID-04A4834D-848F-44D5-8C34-36237D40F194
https://docs.oracle.com/en/cloud/get-started/subscriptions-cloud/whnew/index.html#WHNEW-GUID-751B2C30-99F0-4C21-A01E-FEFB35918BD8
https://community.oracle.com/community/support/support-blogs/database-support-blog/blog/2018/07/23/oracle-database-18c-now-available-for-on-premises
http://white-paper.b-lay.com/oracle-released-oracle-database-18c-whats-new#!/index


Oracle Database 18.3.0 has been available on Linux since July 23, 2018, so I wanted to take a quick look at the Oracle Database 18.3.0 installation on premises. I blogged about the Oracle 18c installation a few weeks ago, but that was a plain 18.1.0. This time I install the 18.3.0 on-prem edition for Linux.


Download Location:  

https://mikedietrichde.com/2018/07/23/oracle-database-18-3-0-on-premises-available-for-download-on-linux/

Upgrade process:

https://mikedietrichde.com/2018/06/20/installing-oracle-database-18c/
https://mikedietrichde.com/2018/07/27/oracle-18-3-0-on-premises-includes-1-4gb-patches/


Oracle Database 18.3.0 installation on premises


Are there any differences between an 18.1.0 and an 18.3.0 installation? No, there aren’t (at least none that I recognized). The most important thing: you must unzip the downloaded file into your future destination directory.

In my case I unzip:

mkdir /u01/app/oracle/product/18
cd /u01/app/oracle/product/18
unzip /media/sf_TEAM/LINUX.X64_180000_db_home.zip
Then call the install script:

./runInstaller

This short video demonstrates the installation process:




And finally run root.sh:

su root
passwd: oracle

cd /u01/app/oracle/product/18
./root.sh
exit
That’s it.

One addition: if you wonder about the new environment variables ORACLE_BASE_HOME and ORACLE_BASE_CONFIG, those are used for the Read-Only Oracle Home feature introduced in Oracle 18c. Find more information in MOS Note 2409465.1 (Oracle 18c – Configuring Read Only OracleHome / DBCA / Patching / Upgrade).


More Information:

https://www.geeksforgeeks.org/dbms/

https://mikedietrichde.com/2018/07/25/oracle-database-18-3-0-installation-on-premises/

https://blogs.oracle.com/database/oracle-database-18c-:-now-available-on-the-oracle-cloud-and-oracle-engineered-systems

https://k21academy.com/oracle-dba/dbas-cloud-dba-oracle-database-18c-18-3-0-on-premise-is-now-available/

https://www.dataintensity.com/customer_hub/

https://go.oracle.com/LP=71504?elqCampaignId=156136&src1=:so:tw:or::TWEMEA&SC=:so:tw:or::TWEMEA&pcode=NAMK180510P00025

http://stevenfeuersteinonplsql.blogspot.com/2018/08/code-you-should-never-see-in-plsql.html

https://commonf18.sched.com/event/Fo5L/power-pick-advanced-sql-data-manipulation-language-double-session

https://commonf18.sched.com/event/FoIF/power-pick-embedded-sql-an-introduction

https://members.common.org/CommonSite/Events/Event_Display.aspx?EventKey=F18&_zs=QOFoe1&_zl=s6H55

https://oracle-base.com/blog/2018/02/17/oracle-database-18c-released-how-to-get-started/

https://sqlmaria.com/2018/02/16/oracle-database-18c-released/

https://www.oracle.com/database/technologies/index.html

https://www.b-lay.com/articles/2018/06/14/oracle-database-18c-whats-new/

https://docs.oracle.com/en/database/oracle/oracle-database/18/newft/new-features.html

http://www.oracle.com/technetwork/developer-tools/sql-developer/overview/index.html

http://www.oracle.com/technetwork/developer-tools/rest-data-services/overview/index.html

http://www.oracle.com/technetwork/developer-tools/sqlcl/overview/sqlcl-index-2994757.html

http://www.oracle.com/technetwork/developer-tools/apex/overview/index.html

If you’d like to try out Oracle Database 18c you can do it here with LiveSQL

https://livesql.oracle.com/apex/livesql/file/index.html

https://docs.oracle.com/en/database/oracle/oracle-database/18/index.html


For more information on when Oracle Database 18c will be available on other platforms, please refer to Oracle Support Document 742060.1.



Installing Oracle Database 18c (18.1.0)
Upgrading to Oracle Database 18.3.0 on-prem (will be available on July 26, 2018)
Why does the Oracle 18.3.0 on premises include 1.4GB patches? (will be available on July 27, 2018)

Kubernetes for the Enterprise!



Announcing SUSE CaaS Platform 3

Containers for Big Data: How MapR Expands Containers Use to Access Data Directly


Every enterprise needs Kubernetes today, including yours.  But with the platform evolving so rapidly, it can be difficult to keep up.  Not to worry, SUSE can take care of that for you: SUSE CaaS Platform delivers Kubernetes advancements to you in an enterprise-grade solution.

SUSE and Big Data


SUSE today announced SUSE CaaS Platform 3, introducing support for a raft of new features and a special focus on the Kubernetes platform operator.  You can read all about it in the press release, but let’s hit on a few of the highlights here.  With SUSE CaaS Platform 3 you can:

Optimize your cluster configuration with expanded datacenter integration and cluster re-configuration options
Setting up your Kubernetes environment is easier than ever with improved integration of private (OpenStack) and public (Amazon Web Services, Microsoft Azure, and Google Cloud Platform) cloud storage, and automatic deployment of the Kubernetes software load balancer.

Persistent Storage for Docker Containers | Whiteboard Walkthrough


A new SUSE toolchain module also allows you to tune the MicroOS container operating system to support your custom hardware configuration needs. Now you can, for example, install additional packages required to run your own monitoring agent or other custom software.
Transform your start-up cluster into a highly available environment. With new cluster reconfiguration capabilities, you can switch from a single-master to a multi-master environment, or vice-versa, to accommodate your changing needs.



Manage container images more efficiently and securely with a local container registry
Download a container image from an external registry once, then save a copy in your own local registry for sharing among all nodes in your cluster. By connecting to an internal proxy rather than an external registry, and by downloading from a local cache rather than a remote server, you’ll improve security and increase performance every time a cluster node pulls an image from the local registry.
For still greater security, disconnect from external registries altogether and use only trusted images you’ve loaded into your local registry.



Try out the new, lightweight CRI-O container runtime, designed specifically for Kubernetes, and introduced in CaaSP 3 as a tech preview feature. Stable and secure, CRI-O is also smaller and architecturally simpler than traditional container runtimes.

Simplify deployment and management of long running workloads through the Apps Workloads API. Promoted to ‘stable’ in upstream Kubernetes 1.9 code, the Apps Workloads API is now supported by SUSE.  This API generally facilitates orchestration (self-healing, scaling, updates, termination) of common types of workloads.

Modern Big Data Pipelines over Kubernetes [I] - Eliran Bivas, Iguazio


With Kubernetes now a must-have for every enterprise, you’ll want to give SUSE CaaS Platform a serious look.  Focused on providing an exceptional platform operator experience, it delivers Kubernetes innovations in a complete, enterprise grade solution that enables IT to deliver the power of Kubernetes to users more quickly, consistently, and efficiently.

SUSE CaaS Platform also serves as the Kubernetes foundation for SUSE Cloud Application Platform, which addresses modern application developers’ needs by bringing the industry’s most respected cloud-native developer experience (Cloud Foundry) into a Kubernetes environment.

SUSE CaaS Platform

KUBERNETES, READY FOR THE ENTERPRISE
SUSE CaaS Platform is an enterprise class container management solution that enables IT and DevOps professionals to more easily deploy, manage, and scale container-based applications and services. It includes Kubernetes to automate lifecycle management of modern applications, and surrounding technologies that enrich Kubernetes and make the platform itself easy to operate. As a result, enterprises that use SUSE CaaS Platform can reduce application delivery cycle times and improve business agility.

SUSE, Hadoop and Big Data Update. Stephen Mogg, SUSE UK


SUSE is focused on delivering an exceptional operator experience with SUSE CaaS Platform.


HDFS on Kubernetes—Lessons Learned - Kimoon Kim


With deep competencies in infrastructure, systems, process integration, platform security, lifecycle management and enterprise-grade support, SUSE aims to ensure IT operations teams can deliver the power of Kubernetes to their users quickly, securely and efficiently. With SUSE CaaS Platform you can:

Achieve faster time to value with an enterprise-ready container management platform, built from industry leading technologies, and delivered as a complete package, with everything you need to quickly offer container services.

Simplify management and control of your container platform with efficient installation, easy scaling, and update automation.

Maximize return on your investment, with a flexible container services solution for today and tomorrow.

Episode 3: Kubernetes and Big Data Services


Key Features

A Cloud Native Computing Foundation (CNCF) certified Kubernetes distribution, SUSE CaaS Platform automates the orchestration and management of your containerized applications and services with powerful Kubernetes capabilities, including:


  • Workload scheduling places containers according to their needs while improving resource utilization
  • Service discovery and load balancing provides an IP address for your service, and distributes load behind the scenes
  • Application scaling up and down accommodates changing load
  • Non-disruptive Rollout/Rollback of new applications and updates enables frequent change without downtime
  • Health monitoring and management supports application self-healing and ensures application availability



In addition, SUSE CaaS Platform simplifies the platform operator’s experience, with everything you need to get up and running quickly, and to manage the environment effectively in production. It provides:

  • Application ecosystem support with SUSE Linux container base images, and access to tools and services offered by SUSE Ready for CaaS Platform partners and the Kubernetes community
  • Enhanced datacenter integration features that enable you to plug Kubernetes into new or existing infrastructure, systems, and processes
  • A complete container execution environment, including a purpose-built container host operating system, container runtime, and container image registries
  • End-to-End security, implemented holistically across the full stack
  • Advanced platform management that simplifies platform installation, configuration, re-configuration, monitoring, maintenance, updates, and recovery
  • Enterprise hardening including comprehensive interoperability testing, support for thousands of platforms, and world-class platform maintenance and technical support

Cisco and SUSE have collaborated for years on solutions that improve efficiencies and lower costs in the data center by leveraging the flexibility and value of the UCS platform and the performance and reliability of the SUSE Linux Enterprise Server.

With focus and advancement in the areas of compute, storage and networking, Cisco and SUSE are now looking to help organizations tackle the challenges associated with the ‘5 Vs’ of Big Data:

1. Volume
2. Variety
3. Velocity
4. Veracity (of data)
5. Value

Ian Chard of Cisco recently published a great read untangling these challenges, and pointing to areas that help harness the power of data analytics.
Article content Below:

The harnessing of data through analytics is key to staying competitive and relevant in the age of connected computing and the data economy.

Analytics now combines statistics, artificial intelligence, machine learning, deep learning and data processing in order to extract valuable information and insights from the data flowing through your business.

Unlock the Power of Kubernetes for Big Data by Joey Zwicker, Pachyderm



Your ability to harness analytics defines how well you know your business, your customers, and your partners – and how quickly you understand them.

But it’s still hard to gain valuable insights from data. Collectively the challenges are known as the ‘5 Vs of big data’:

The Volume of data has grown so much that traditional relational database management software running on monolithic servers is incapable of processing it.

The Variety of data has also increased. There are many more sources of data and many more different types.

Velocity describes how fast the data is coming in. It has to be processed, often in real time, and stored in huge volume.

Veracity of data refers to how much you can trust it. Traditional structured data (i.e. in fixed fields or formats) goes through a validation process. This approach does not work with unstructured (i.e. raw) data.

Deriving Value from the data is hard due to the above.

If you’re wrestling with the 5 Vs, chances are you’ll be heading to ExCeL London for the annual Strata Data Conference on 22-24 May 2018.

We’ll be there on Booth 316, together with our partners including SUSE, where we’ll be showcasing how the progress made in compute, storage, and networking, as well as in distributed data processing frameworks, can help to address these challenges.

1) The Infrastructure evolution
Compute demands are growing in direct response to data growth. More powerful servers, or more servers working in parallel – aka scale-out – are needed.

Deep learning techniques for example can absorb an insatiable amount of data, making a robust HDFS cluster a great way to achieve scale out storage for the collection and preparation of the data. Machine learning algorithms can run on traditional x86 CPUs, but GPUs can accelerate these algorithms by up to a factor of 100.

New approaches to data analytics applications and storage are also needed because the majority of the data available is unstructured. Email, text documents, images, audio, and video are data types that are a poor fit for relational databases and traditional storage methods.

Google Cloud Dataproc - Easier, faster, more cost-effective Spark and Hadoop


Storing data in the public cloud can ease the load. But as your data grows and you need to access it more frequently, cloud services can become expensive, while the sovereignty of that data can be a concern.

Software-defined storage is a storage virtualisation technology that allows you to shift large amounts of unstructured data to cost-effective, flexible solutions located on-premises. This assures performance and data sovereignty while reducing storage costs over time.

You can use platforms such as Hadoop to create shared repositories of unstructured data known as data lakes. Running on a cluster of servers, data lakes can be accessed by all users. However, they must be managed in a way that’s compliant, using enterprise-class data management platforms that allow you to store, protect and access data quickly and easily.

2) Need for speed
The pace of data analytics innovation continues to increase. Previously, you would define your data structures and build an application to operate on the data. The lifetime of such applications was measured in years.

Today, raw data is collected and explored for meaningful patterns using applications that are rebuilt when new patterns emerge. The lifetime of these applications is measured in months – and even days.

The value of data can also be short-lived. There’s a need to analyse it at source, as it arrives, in real time. Data solutions that employ in-memory processing for example, give your users immediate, drill-down access to all the data across your enterprise applications, data warehouses and data lakes.

3) Come see us at Strata Data Conference
Ultimately, your ability to innovate at speed with security and governance assured comes down to your IT infrastructure.

Cisco UCS is a trusted computing platform proven to deliver lower TCO and optimum performance and capacity for data-intensive workloads.

85% of Fortune 500 companies and more than 60,000 organisations globally rely on our validated solutions. These combine our servers with software from a broad ecosystem of partners to simplify the task of pooling IT resources and storing data across systems.

Modern big data and machine learning in the era of cloud, docker and kubernetes


Crucially, they come with role- and policy-based management, which means you can configure hundreds of storage servers as easily as you can configure one, making scale-out a breeze as your data analytics projects mature.

If you’re looking to transform your business and turn your data into insights faster, there’s plenty of reasons to come visit us on booth 316:

4) Accelerated Analytics
If your data lake is deep and your data scientists are struggling to make sense of what lies beneath, then our MapD demo, powered by data from mobile masts, will show you how to cut through the depths and find the enlightenment you seek fast.

5) Deep learning with Cloudera Data Science Workbench
For those with a Hadoop cluster to manage their data lakes and deep learning framework, we’ll be demonstrating how to accelerate the training of deep learning modules with Cisco UCS C240 and C480 servers equipped with 2 and 6 GPUs respectively. We’ll also show you how to support growing cluster sizes using cloud-managed service profiles rather than more manpower.

6) Get with the Cisco Gateway
If you’re already a customer and fancy winning some shiny new tech, why not step through the Gateway to grow your reputation as a thought leader and showcase the success you’ve had?

7) Find your digital twin
To effectively create a digital twin of the enterprise, data scientists have to incorporate data sources inside and outside of the data centre for a holistic 360-degree view. Come join our resident expert Han Yang for his session on how we’re benefiting from big data and analytics, as well as helping our customers to incorporate data sources from the Internet of Things and deploy machine learning at the edge and in the enterprise.

8) Get the scoop with SUSE
We’re set to unveil a new integration of SUSE Linux Enterprise Server and Cisco UCS. There’ll be SUSE specialists on our booth, so you can be the first to find out more about what’s in the pipeline.

What is Kubernetes?

Kubernetes is an open source system for automatically orchestrating and managing containerized applications.

AI meets Big Data



Designing applications using open source Linux containers is an ideal approach for building cloud-native applications for hosting in private, public or hybrid clouds. Kubernetes automates the deployment, management and scaling of these containerized applications, making the whole process easier, faster and more efficient.

Businesses of all types are looking for a new paradigm to drive faster innovation and agility. This is changing forever how applications are architected, deployed, scaled and managed to deliver new levels of innovation and agility. Kubernetes has become widely embraced by almost everyone interested in dramatically accelerating application delivery with containerized and cloud-native workloads.

Kubernetes is now seen as the outright market leader by software developers, operations teams, DevOps professionals and IT business decision makers.

Manage Microservices & Fast Data Systems on One Platform w/ DC/OS

Kubernetes Heritage

Kubernetes was originally the brainchild of Google. Google has been building and managing container-based applications and cloud-native workloads in production and at scale for well over a decade. Kubernetes emerged from the knowledge and experience gained with earlier Google container management systems called Borg and Omega.

Extending DevOps to Big Data Applications with Kubernetes


Now an open source project, Kubernetes is under the stewardship of the Cloud Native Computing Foundation (CNCF) and The Linux Foundation. This ensures that the project benefits from the best ideas and practices from a huge open source community and makes sure the danger of vendor lock-in is avoided.

Key Features:

  • Deploy applications rapidly and predictably to private, public or hybrid clouds
  • Scale applications non-disruptively
  • Roll out new features seamlessly
  • Lean and efficient use of computing resources
  • Keep production applications up and running with self-healing capabilities

SUSE and Kubernetes

SUSE believes Kubernetes will be a key element of the application delivery solutions needed to drive the enterprise business of the future.

Big data and Kubernetes



Here is a selection of SUSE products built using Kubernetes:

SUSE Cloud Application Platform brings advanced Cloud Foundry productivity to modern Kubernetes infrastructure, helping software development and operations teams to streamline lifecycle management of traditional and new cloud-native applications. Building on SUSE CaaS Platform, SUSE Cloud Application Platform adds a unique Kubernetes-based implementation of Cloud Foundry, introducing a powerful DevOps workflow into a Kubernetes environment. Built on enterprise-grade Linux and with full Cloud Foundry and Kubernetes certification, it is an outstanding platform to support the entire development lifecycle for traditional and new cloud-native applications.

SUSE OpenStack Cloud makes it easy to spin up Kubernetes clusters in a full multi-tenant environment, allowing different users to have their own Kubernetes cluster. Customers can use either the built-in support for OpenStack Magnum or leverage SUSE CaaS Platform, which gives the added benefits of ready-to-run images, templates and heat automation. With these Kubernetes-as-a-Service capabilities, it’s no wonder that OpenStack users are reported to be adopting containers 3 times faster than the rest of the enterprise market.


SUSE CaaS Platform is a certified Kubernetes software distribution. It provides an enterprise-class container management solution that enables IT and DevOps professionals to more easily deploy, manage, and scale container-based applications and services. Using SUSE CaaS Platform, enterprises can reduce application delivery cycle times and improve business agility.

What's the Hadoop-la about Kubernetes?

THE DIFFERENCE BETWEEN BIG DATA AND FAST DATA

Big data, long an industry buzzword, is now commonplace among most businesses. A 2014 survey from Gartner found 73 percent of organizations had already invested or planned to invest in big data by 2016. For many companies, the question now is not how to manage and harness data, but how to do it even more effectively. The next frontier for big data is to master speed. If you can’t analyze big data in real time, you lose much of the value of the information passing through databases.

What is fast data?

Fast Data with Apache Ignite and Apache Spark - Christos Erotocritou


While big data refers to the massive fire hose of information generated each hour, fast data refers to data that provides real-time insights. In many industries, especially the payment industry, making quick analyses of information is crucial to the bottom line. For example, fast data could prevent a massive breach that would release sensitive customer information. In this case, analyzing data in real time is far more important than storing it in massive quantities. When it comes to ecommerce fraud, the insights happening in the moment matter the most.

Kubernetes vs Docker Swarm | Container Orchestration War | Kubernetes Training | Edureka


As a Wired article put it, where in the past, gaining insights from big data was like finding a needle in a haystack, fast data is like finding the needle as soon as it’s dropped.

Fast data for payments

“For payment systems, decisions must be made in the sub-second range,” Richard Harris, head of international operations at Feedzai, recently told Payment Cards and Mobile. “Our clients typically require 20-50 millisecond response times. So we’ve overcome this by using technology founded in the Big Data era, such as Hadoop and Cassandra.”

Apache Spark on Kubernetes - Anirudh Ramanathan & Tim Chen


Payment processor First Data and Feedzai have teamed up to use machine learning to fight fraud. Feedzai monitors the company’s STAR Network, which enables debit payments for First Data’s clients.

Todd Clark, Senior Vice President and Head of STAR Network and Debit Processing at First Data, explained: “The combination of Feedzai’s machine learning software and First Data’s experience has made the STAR Network capable of scoring over 3,000 transactions per second.”

 “This big speed and accuracy advantage means the STAR network is less of an attractive target for fraud,” Harris said.

Infrastructure challenges

Not all systems are set up to handle fast data. Without the right tools to manage the data flow quickly, valuable insights are lost or gained too late to be of use. While many existing platforms can handle and store large quantities of data, most fall behind when it comes to analyzing the information in real time. To begin with, organizations need to move beyond systems that only allow batch processing, according to Wired. In this case, companies need to tell computers to analyze large batches of information, which it processes one at a time – similar to the way credit card bills are processed at the end of each month.

With most companies now set up to gain insights from big data, the next step is to enable real-time insights. In the payment world, this means catching potential fraud as it’s happening, not waiting until it has already happened.



Beyond Hadoop: The Rise of Fast Data




Over the past two to three years, companies have started transitioning from big data, where analytics are processed after-the-fact in batch mode, to fast data, where data analysis is done in real-time to provide immediate insights. For example, in the past, retail stores such as Macy’s analyzed historical purchases by store to determine which products to add to stores in the next year. In comparison, Amazon drives personalized recommendations based on hundreds of individual characteristics about you, including what products you viewed in the last five minutes.

Containerized Hadoop beyond Kubernetes



Big data is collected from many sources in real-time, but is processed after collection in batches to provide information about the past. The benefits of data are lost if real-time streaming data is dumped into a database because of the inability to act on data as it is collected.

Super Fast Real-time Data Processing on Cloud-Native Architecture [I] - Yaron Haviv, iguazio


Modern applications need to respond to events happening now, to provide insights in real time. To do this they use fast data, which is processed as it is collected to provide real-time insights. Whereas big data provided insights into user segmentation and seasonal trending using descriptive (what happened) and predictive analytics (what will likely happen), fast data allows for real-time recommendations and alerting using prescriptive analytics (what should you do about it).

Big help for your first big data project

It’s clear. Today, big data is changing the way companies work. What hasn’t been clear is how companies should go about implementing big data projects.

Until now.

Our highly practical workbook is full of advice about big data that’ll help you keep your project on track. From setting clear goals to strategic resourcing and ideal big data architectures, we’ve covered everything you need to know about big data.

Streaming Big Data with Heron on Kubernetes Cluster


Read “The Big, Big Data Workbook” to gain insights into:

  • How to choose the right project and set up the right goals
  • How to build the right team and maximize productivity
  • What your data governance framework should look like
  • The architecture and processes you should aim to build

“The Big, Big Data Workbook” is a comprehensive guide about the practical aspects of big data and an absolute must-read if you’re attempting to bring greater insights to your enterprise.

More Information:

https://mesosphere.com/blog/fast-data-new-big-data/

https://www.suse.com/partners/big-data/

https://www.suse.com/partners/big-data/partners/

https://www.suse.com/c/the-cisco-and-suse-strata-for-big-data-solutions/

https://mapr.com/partners/partner/suse/

https://en.opensuse.org/Events:bigdata

https://en.opensuse.org/Portal:Wiki

https://www.suse.com/solutions/kubernetes/

https://www.suse.com/products/caas-platform/

https://www.brighttalk.com/webcast/11477/266605

https://www.brighttalk.com/webcast/11477/266601

https://www.i-scoop.eu/big-data-action-value-context/

https://gblogs.cisco.com/uki/8-reasons-why-you-need-fluid-tech-for-data-analytics-success/?doing_wp_cron=1526651829.1563229560852050781250

https://now.informatica.com/en_the-big-big-data-workbook_book_2730_ppc_nl.html#fbid=im02iU8HQVv


Windows Server 2019 (Version: 10.0.17763) and SQL server 2019


The Latest from Ignite 2018

From Ops to DevOps with Windows Server containers and Windows Server 2019


Windows Server 2019 will be generally available in October, and we have updated Windows Admin Center, version 1809, to support Windows Server 2019 and Azure hybrid scenarios. Windows Server 2019 builds on the foundation of Windows Server 2016, the fastest-adopted version of Windows Server, with tens of millions of instances deployed worldwide. Customers like Alaska Airlines, Tyco, and Tieto have adopted Windows Server 2016 to modernize their datacenters.


What's new in Remote Desktop Services on Windows Server 2019



Through various listening channels such as the Insider program, product telemetry analysis, and industry trends, we heard loud and clear that hybrid, security, agility, and TCO are top of mind for our customers. Datacenter modernization is critical to support your business and deliver innovation, especially given the competitive landscape today. Windows Server 2019 is designed and engineered to help modernize your datacenter, delivering on four key areas:

Hybrid: The move to the cloud is a journey. A hybrid approach, one that combines on-premises and cloud environments working together, is a core element of our customers’ modernization strategy. This is why hybrid is built into Windows Server 2019 and Windows Admin Center. To make it easier to connect existing Windows Server deployments to Azure services, we built interfaces for hybrid capabilities into the Windows Admin Center. With Windows Admin Center and Windows Server 2019, customers can use hybrid features like Azure Backup, Azure File Sync, and disaster recovery to extend their datacenters to Azure. We also added the Storage Migration Service to help migrate file servers and their data to Azure without the need to reconfigure applications or users.

Windows Server 2019 deep dive | Best of Microsoft Ignite 2018


Security: Security continues to be a top priority for our customers. With the security threats growing in number and becoming more and more sophisticated, we continue to keep a persistent focus on security. Our approach to security is three-fold: Protect, Detect, and Respond. We bring security features in all three areas to Windows Server 2019. On the Protect front, we had previously introduced Shielded VMs to protect sensitive virtualized workloads such as Domain Controllers, PCI data, sensitive healthcare, and financial data among others. In Windows Server 2019, we extended support of Shielded VMs to Linux VMs. On the Detect and Respond front, we enabled Windows Defender Advanced Threat Protection (ATP), that detects attacks and zero-day exploits among other capabilities. Windows Server 2019 also includes Defender Exploit Guard to help you elevate the security posture of your IT environment and combat ransomware attacks.

Windows Server 2019 deep dive


Application Platform: A key guiding principle for us on the Windows Server team is a relentless focus on the developer experience. We learned from your feedback that a smaller container image size will significantly improve the experience of developers and IT pros who are modernizing their existing applications using containers. In Windows Server 2019, we reduced the Server Core base container image to a third of its size. We also provide improved app compatibility, support for Service Fabric and Kubernetes, and support for Linux containers on Windows to help modernize your apps. One piece of feedback we constantly hear from developers is the complexity of navigating environments with Linux and Windows deployments. To address that, we previously extended the Windows Subsystem for Linux (WSL) into insider builds for Windows Server, so that customers can run Linux containers side-by-side with Windows containers on a Windows Server. In Windows Server 2019, we are continuing on this journey to improve WSL, helping Linux users bring their scripts to Windows while using industry standards like OpenSSH, curl and tar.

Windows 2019


Hyper-converged Infrastructure (HCI): HCI is one of the latest trends in the server industry today. It is primarily because customers understand the value of using servers with high performant local disks to run their compute and storage needs at the same time. In Windows Server 2019, we democratize HCI with cost-effective high-performance software-defined storage and networking that allows deployments to scale from small 2-node, all the way up to 100s of servers with Cluster Sets technology, making it affordable regardless of the deployment scale. Through our Windows Server Software Defined program, we partner with industry leading hardware vendors to provide an affordable and yet extremely robust HCI solution with validated design.

In October, customers will have access to Windows Server 2019 through all the channels! We will publish a blog post to mark the availability of Windows Server 2019 soon.

Are you wondering how Windows Server 2016 vs. 2019 compare? Let’s find out!

Microsoft, the Redmond giant, has recently announced the new version of Windows Server. Aptly named Windows Server 2019, it is already available for download by users of Insider builds and should be generally available quite soon. How does it improve the user experience over Windows Server 2016? Let us find out through an introduction to the new features in Windows Server 2019.



Windows server 2016 vs. 2019 – What’s the difference?
Windows Server 2019 was officially announced on March 20, 2018, through a post on the official Windows Server Blog. The new server edition will be available to the general public in the second half of calendar year 2018. If you want to try it before everyone else, you can do so by registering for the Windows Insider Program.

Differentiating Windows Server 2019 from its predecessor, Windows Server 2016, is not an easy task. The latest version of Windows Server is based on Windows Server 2016, so you will find almost all the features along similar lines, apart from the new improvements and optimizations. We will attempt to differentiate between the two based on the new features.



Windows Server 2016 has been one of the fastest-adopted server versions from the Redmond giant. Windows Server 2019 continues from where the 2016 version left off. The primary areas selected for changes and improvements were Hybrid, Security, Application Platform, and Hyper-converged Infrastructure.


Hybrid Cloud Scenario



Windows Server 2019 uses a hybrid approach for the move to the cloud. Unlike the option available on Windows Server 2016, on-premises and cloud solutions work together, offering an enhanced environment for users.

Windows Server 2016 uses Active Directory, file server synchronization and backing up of data in the cloud. The difference lies in the way Windows Server 2019 lets on-premises deployments make use of more advanced systems like IoT and artificial intelligence. The hybrid approach ensures that you have a future-proof, long-term option.

Integration with Project Honolulu offers you a seamless, lightweight and flexible platform for all your needs. If you are using Microsoft’s cloud service, Microsoft Azure, this is something you will indeed love.

New Security Services

Security is yet another area that has received an impetus since Windows Server 2016. Server 2016 relied on Shielded VMs; what has changed with the new version is the additional support for Linux VMs.

Windows Server 2019 introduces new security features with an emphasis on three particular areas: Protect, Detect and Respond. Windows Server 2019 also brings extended support for VMConnect to help you troubleshoot Shielded VMs for both Windows Server and Linux.

Another functionality added since Windows Server 2016 is the embedded Windows Defender Advanced Threat Protection, which can perform efficient preventive actions for the detection of attacks.

Application Platform

Microsoft has been focusing on enhanced developer experiences. Windows Server 2019 brings new developments in the form of improved Windows Server containers and the Windows Subsystem for Linux (WSL).

Windows Server 2016 performed well with Windows Server containers, and the concept has seen strong adoption: thousands of container images have been downloaded since the launch of the 2016 edition. Windows Server 2019 aims to reduce the size of the Server Core base container image, which should improve development and performance remarkably.


Windows Server 2016 introduced support for a robust set of Hyper-Converged Infrastructure (HCI) options, bringing in support from the industry’s leading hardware vendors. Windows Server 2019 takes this further.

Windows Server 2019: What’s new and what's next


Yes, the 2019 version brings a few extra features: extra scale, performance, reliability and better support for HCI deployment. Project Honolulu, mentioned above, brings a high-performance interface for Storage Spaces Direct. However, smaller businesses may find it out of reach for now.

Enterprise-grade hyperconverged infrastructure (HCI)

With the release of Windows Server 2019, Microsoft rolls up three years of updates for its HCI platform. That’s because the gradual upgrade schedule Microsoft now uses includes what it calls Semi-Annual Channel releases – incremental upgrades as they become available. Then every couple of years it creates a major release called the Long-Term Servicing Channel (LTSC) version that includes the upgrades from the preceding Semi-Annual Channel releases.

The LTSC Windows Server 2019 is due out this fall, and is now available to members of Microsoft’s Insider program.

While the fundamental components of HCI (compute, storage and networking) have been improved with the Semi-Annual Channel releases, for organizations building datacenters and high-scale software defined platforms, Windows Server 2019 is a significant release for the software-defined datacenter.

What's new in Active Directory Federation Services (AD FS) in Windows Server 2019


With the latest release, HCI is provided on top of a set of components that are bundled with the server license. This means a backbone of servers running Hyper-V to enable dynamic increase or decrease of capacity for workloads without downtime.

Improvements in security

Microsoft has continued to include built-in security functionality to help organizations address an “expect breach” model of security management.  Rather than assuming firewalls along the perimeter of an enterprise will prevent any and all security compromises, Windows Server 2019 assumes servers and applications within the core of a datacenter have already been compromised.

Windows Server 2019 includes Windows Defender Advanced Threat Protection (ATP), which assesses common vectors for security breaches and automatically blocks and alerts about potential malicious attacks. Users of Windows 10 have received many of the Windows Defender ATP features over the past few months. Including Windows Defender ATP on Windows Server 2019 lets organizations take advantage of data storage, network transport and security-integrity components to prevent compromises on Windows Server 2019 systems.

Smaller, more efficient containers

Organizations are rapidly minimizing the footprint and overhead of their IT operations and eliminating more bloated servers with thinner and more efficient containers. Windows Insiders have benefited by achieving higher density of compute to improve overall application operations with no additional expenditure in hardware server systems or expansion of hardware capacity.

Getting started with Windows Server containers in Windows Server 2019


Windows Server 2019 has a smaller, leaner Server Core image that cuts virtual machine overhead by 50-80 percent. When an organization can get the same (or more) functionality in a significantly smaller image, it is able to lower costs and improve efficiencies in its IT investments.

Windows subsystem on Linux

A decade ago, one would rarely say Microsoft and Linux in the same breath as complementary platform services, but that has changed. Windows Server 2016 has open support for Linux instances as virtual machines, and the new Windows Server 2019 release makes huge headway by including an entire subsystem optimized for the operation of Linux systems on Windows Server.

The Windows Subsystem for Linux extends basic virtual machine operation of Linux systems on Windows Server and provides a deeper layer of integration for networking, native filesystem storage and security controls. It can enable encrypted Linux virtual instances, just as Microsoft provided Shielded VMs for Windows in Windows Server 2016; Windows Server 2019 now brings native Shielded VMs for Linux as well.

Be an IT hero with Storage Spaces Direct in Windows Server 2019


Enterprises have found the optimization of containers along with the ability to natively support Linux on Windows Server hosts can decrease costs by eliminating the need for two or three infrastructure platforms, and instead running them on Windows Server 2019.

Because most of the “new features” in Windows Server 2019 have been included in updates over the past couple years, these features are not earth-shattering surprises.  However, it also means that the features in Windows Server 2019 that were part of Windows Server 2016 Semi-Annual Channel releases have been tried, tested, updated and proven already, so that when Windows Server 2019 ships, organizations don’t have to wait six to 12 months for a service pack of bug fixes.

This is a significant change that is helping organizations plan their adoption of Windows Server 2019 sooner than they might have adopted a major release platform in the past, and it brings significant improvements for enterprise datacenters in meeting the security, scalability, and optimized data center requirements so badly needed in today’s fast-paced environments.


Windows Server 2019 has the following new features:

  • Windows Subsystem for Linux (WSL)
  • Support for Kubernetes (Beta)
  • Other GUI new features from Windows 10 version 1809.
  • Storage Spaces Direct.
  • Storage Migration Service.
  • Storage Replica.
  • System Insights.
  • Improved Windows Defender.

What is New in Windows Server 2019

Windows Server 2019 has four main areas of investments and below is glimpse of each area.

Hybrid: Windows Server 2019 and Windows Admin Center will make it easier for our customers to connect existing on-premises environments to Azure. With Windows Admin Center it also easier for customers on Windows Server 2019 to use Azure services such as Azure Backup, Azure Site Recovery, and more services will be added over time.
Security: Security continues to be a top priority for our customers and we are committed to helping our customers elevate their security posture. Windows Server 2016 started on this journey and Windows Server 2019 builds on that strong foundation, along with some shared security features with Windows 10, such as Defender ATP for server and Defender Exploit Guard.
Application Platform: Containers are becoming popular as developers and operations teams realize the benefits of running in this new model. In addition to the work we did in Windows Server 2016, we have been busy with the Semi-Annual Channel releases and all that work culminates in Windows Server 2019. Examples of these include Linux containers on Windows, the work on the Windows Subsystem for Linux (WSL), and the smaller container images.
Hyper-converged Infrastructure (HCI): If you are thinking about evolving your physical or host server infrastructure, you should consider HCI. This new deployment model allows you to consolidate compute, storage, and networking into the same nodes allowing you to reduce the infrastructure cost while still getting better performance, scalability, and reliability.

Microsoft SQL Server 2019 

SQL Server 2019 Vision


What’s New in Microsoft SQL Server 2019 

• Big Data Clusters


  • Deploy a Big Data cluster with SQL and Spark Linux containers on Kubernetes
  • Access your big data from HDFS
  • Run Advanced analytics and machine learning with Spark
  • Use Spark streaming to stream data to SQL data pools
  • Use Azure Data Studio to run Query books that provide a notebook experience

• Database engine


  • UTF-8 support
  • Resumable online index create allows an index create operation to resume after interruption (see the example after this list)
  • Clustered columnstore online index build and rebuild
  • Always Encrypted with secure enclaves
  • Intelligent query processing
  • Java language programmability extension
  • SQL Graph features
  • Database scoped configuration setting for online and resumable DDL operations
  • Always On Availability Groups – secondary replica connection redirection
  • Data discovery and classification – natively built into SQL Server
  • Expanded support for persistent memory devices
  • Support for columnstore statistics in DBCC CLONEDATABASE
  • New options added to sp_estimate_data_compression_savings
  • SQL Server Machine Learning Services failover clusters
  • Lightweight query profiling infrastructure enabled by default
  • New Polybase connectors
  • New sys.dm_db_page_info system function returns page information
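
As a sketch of the resumable online index create called out above (table and index names are hypothetical):

-- Create the index online and allow the operation to be paused and resumed
CREATE INDEX IX_Orders_CustomerId
    ON dbo.Orders (CustomerId)
    WITH (ONLINE = ON, RESUMABLE = ON, MAX_DURATION = 60 MINUTES);

-- Pause and later resume the build if it needs to yield to other work
-- ALTER INDEX IX_Orders_CustomerId ON dbo.Orders PAUSE;
-- ALTER INDEX IX_Orders_CustomerId ON dbo.Orders RESUME;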

• SQL Server on Linux


  • Replication support
  • Support for the Microsoft Distributed Transaction Coordinator (MSDTC)
  • Always On Availability Group on Docker containers with Kubernetes
  • OpenLDAP support for third-party AD providers
  • Machine Learning on Linux
  • New container registry
  • New RHEL-based container images
  • Memory pressure notification

• Master Data Services


  • Silverlight controls replaced

• Security


  • Certificate management in SQL Server Configuration Manager

• Tools


  • SQL Server Management Studio (SSMS) 18.0 (preview)
  • Azure Data Studio

Introducing Microsoft SQL Server 2019 Big Data Clusters

SQL Server 2019 big data clusters make it easier for big data sets to be joined to the dimensional data typically stored in the enterprise relational database, enabling people and apps that use SQL Server to query big data more easily. The value of the big data greatly increases when it is not just in the hands of the data scientists and big data engineers but is also included in reports, dashboards, and applications. At the same time, the data scientists can continue to use big data ecosystem tools while also utilizing easy, real-time access to the high-value data in SQL Server because it is all part of one integrated, complete system.
Read the complete (awesome) blog post from Travis Wright about SQL Server 2019 Big Data Clusters here: https://cloudblogs.microsoft.com/sqlserver/2018/09/25/introducing-microsoft-sql-server-2019-big-data-clusters/
Starting in SQL Server 2017 with support for Linux and containers, Microsoft has been on a journey of platform and operating system choice. With SQL Server 2019 preview, we are making it easier to adopt SQL Server in containers by enabling new HA scenarios and adding supported Red Hat Enterprise Linux container images. Today we are happy to announce the availability of SQL Server 2019 preview Linux-based container images on Microsoft Container Registry, Red Hat-Certified Container Images, and the SQL Server operator for Kubernetes, which makes it easy to deploy an Availability Group.

SQL Server 2019: Celebrating 25 years of SQL Server Database Engine and the path forward
Awesome work, Microsoft SQL team, and congrats on your 25th anniversary!
Microsoft announced the preview of SQL Server 2019. For 25 years, SQL Server has helped enterprises manage all facets of their relational data. In recent releases, SQL Server has gone beyond querying relational data by unifying graph and relational data and bringing machine learning to where the data is with R and Python model training and scoring. As the volume and variety of data increases, customers need to easily integrate and analyze data across all types of data.

SQL Server 2019 big data clusters - intro session


Now, for the first time ever, SQL Server 2019 creates a unified data platform with Apache Spark™ and Hadoop Distributed File System (HDFS) packaged together with SQL Server as a single, integrated solution. Through the ability to create big data clusters, SQL Server 2019 delivers an incredible expansion of database management capabilities, further redefining SQL Server beyond a traditional relational database. And as with every release, SQL Server 2019 continues to push the boundaries of security, availability, and performance for every workload with Intelligent Query Processing, data compliance tools and support for persistent memory. With SQL Server 2019, you can take on any data project, from traditional SQL Server workloads like OLTP, Data Warehousing and BI, to AI and advanced analytics over big data.

SQL Server 2017 Deep Dive


SQL Server provides a true hybrid platform, with a consistent SQL Server surface area from your data center to public cloud—making it easy to run in the location of your choice. Because SQL Server 2019 big data clusters are deployed as containers on Kubernetes with a built-in management service, customers can get a consistent management and deployment experience on a variety of supported platforms on-premises and in the cloud: OpenShift or Kubernetes on premises, Azure Kubernetes Service (AKS), Azure Stack (on AKS) and OpenShift on Azure. With Azure Hybrid Benefit license portability, you can choose to run SQL Server workloads on-premises or in Azure, at a fraction of the cost of any other cloud provider.

SQL Server – Insights over all your data

SQL Server continues to embrace open source, from SQL Server 2017 support for Linux and containers to SQL Server 2019 now embracing Spark and HDFS to bring you a unified data platform. With SQL Server 2019, all the components needed to perform analytics over your data are built into a managed cluster, which is easy to deploy and it can scale as per your business needs. HDFS, Spark, Knox, Ranger, Livy, all come packaged together with SQL Server and are quickly and easily deployed as Linux containers on Kubernetes. SQL Server simplifies the management of all your enterprise data by removing any barriers that currently exist between structured and unstructured data.

SQL Server 2019 big data clusters - deep dive session


Here’s how we make it easy for you to break down the barriers to realizing insights across all your data, providing one view of your data across the organization:

Simplify big data analytics for SQL Server users. SQL Server 2019 makes it easier to manage big data environments. It comes with everything you need to create a data lake, including HDFS and Spark provided by Microsoft and analytics tools, all deeply integrated with SQL Server and fully supported by Microsoft. Now you can run apps, analytics, and AI over structured and unstructured data: analysts can use familiar T-SQL queries, while people familiar with Spark can use Python, R, Scala, or Java to run Spark jobs for data preparation or analytics – all in the same, integrated cluster.
Give developers, data analysts, and data engineers a single source for all your data – structured and unstructured – using their favorite tools. With SQL Server 2019, data scientists can easily analyze data in SQL Server and HDFS through Spark jobs. Analysts can run advanced analytics over big data using SQL Server Machine Learning Services: train over large datasets in Hadoop and operationalize in SQL Server. Data scientists can use a brand new notebook experience running on the Jupyter notebooks engine in a new extension of Azure Data Studio to interactively perform advanced analysis of data and easily share the analysis with their colleagues.
Break down data silos and deliver one view across all of your data using data virtualization. Starting in SQL Server 2016, PolyBase has enabled you to run a T-SQL query inside SQL Server to pull data from your data lake and return it in a structured format—all without moving or copying the data. Now in SQL Server 2019, we’re expanding that concept of data virtualization to additional data sources, including Oracle, Teradata, MongoDB, PostgreSQL, and others. Using the new PolyBase, you can break down data silos and easily combine data from many sources using virtualization to avoid the time, effort, security risks and duplicate data created by data movement and replication. New elastically scalable “data pools” and “compute pools” make querying virtualized data lightning fast by caching data and distributing query execution across many instances of SQL Server; a minimal T-SQL sketch of this workflow follows below.
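To make the data virtualization idea concrete, here is a minimal, hedged T-SQL sketch of the new PolyBase workflow against a hypothetical Oracle source; the server name, credential, and remote ORDERS table are illustrative assumptions, not objects referenced elsewhere in this article.

```sql
-- Minimal PolyBase data virtualization sketch (SQL Server 2019 preview).
-- The Oracle host, credentials, and remote table are hypothetical.
CREATE MASTER KEY ENCRYPTION BY PASSWORD = '<strong password>';  -- once per database

CREATE DATABASE SCOPED CREDENTIAL OracleReader
    WITH IDENTITY = 'oracle_user', SECRET = '<oracle password>';

CREATE EXTERNAL DATA SOURCE OracleSales
    WITH (LOCATION = 'oracle://oraclehost:1521', CREDENTIAL = OracleReader);

-- Expose the remote table in SQL Server; no data is copied or moved.
CREATE EXTERNAL TABLE dbo.RemoteOrders
(
    OrderID    INT            NOT NULL,
    CustomerID INT            NOT NULL,
    OrderTotal DECIMAL(12, 2) NOT NULL
)
WITH (LOCATION = 'XE.SALES.ORDERS', DATA_SOURCE = OracleSales);

-- Query the remote data with plain T-SQL, just like a local table.
SELECT TOP (10) CustomerID, SUM(OrderTotal) AS Total
FROM dbo.RemoteOrders
GROUP BY CustomerID
ORDER BY Total DESC;
```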

“From its inception, the Sloan Digital Sky Survey database has run on SQL Server, and SQL Server also stores object catalogs from large cosmological simulations. We are delighted with the promise of SQL Server 2019 big data clusters, which will allow us to enhance our databases to include all our big data sets. The distributed nature of SQL Server 2019 allows us to expand our efforts to new types of simulations and to the next generation of astronomical surveys with datasets up to 10PB or more, well beyond the limits of our current database solutions.”- Dr. Gerard Lemson, Institute for Data Intensive Engineering and Science, Johns Hopkins University.

Enhanced performance, security, and availability

The SQL Server 2019 relational engine will deliver new and enhanced features in the areas of mission-critical performance, security and compliance, and database availability, as well as additional features for developers, SQL Server on Linux and containers, and general engine enhancements.

Industry-leading performance – The Intelligent Database

The Intelligent Query Processing family of features builds on the hands-free performance tuning features of Adaptive Query Processing in SQL Server 2017, adding Row mode memory grant feedback, approximate COUNT DISTINCT, Batch mode on rowstore, and table variable deferred compilation; a short T-SQL example follows this list.
Persistent memory support is improved in this release with a new, optimized I/O path available for interacting with persistent memory storage.
The lightweight query profiling infrastructure is now enabled by default to provide per-query operator statistics anytime and anywhere you need them.
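As a hedged illustration of two of these capabilities, the sketch below uses the APPROX_COUNT_DISTINCT aggregate for fast approximate distinct counts and reads live per-operator statistics from the sys.dm_exec_query_profiles DMV that the always-on lightweight profiling feeds; the table, column, and session id are illustrative assumptions.

```sql
-- Approximate distinct count: trades a small error margin for speed and memory
-- on very large tables (dbo.WebClicks and its columns are hypothetical).
SELECT APPROX_COUNT_DISTINCT(UserId) AS ApproxUniqueUsers
FROM dbo.WebClicks;

-- With lightweight query profiling enabled by default, live per-operator
-- statistics for a running query can be inspected from another session.
SELECT physical_operator_name, row_count, estimate_row_count
FROM sys.dm_exec_query_profiles
WHERE session_id = 57;  -- hypothetical session id of the running query
```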


Advanced security – Confidential Computing

Always Encrypted with secure enclaves extends the client-side encryption technology introduced in SQL Server 2016. Secure enclaves protect sensitive data in a hardware or software-created enclave inside the database, securing it from malware and privileged users while enabling advanced operations on encrypted data.
SQL Data Discovery and Classification is now built into the SQL Server engine with new metadata and auditing support to help with GDPR and other compliance needs; a minimal example follows this list.
Certificate Management is now easier using SQL Server Configuration Manager.
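As a small, hedged example of the built-in classification support, the T-SQL below labels a column and then reads the classifications back from the catalog for auditing; the table, column, and label values are made up for illustration.

```sql
-- Attach a sensitivity label to a column (dbo.Customers.CreditCardNumber is hypothetical).
ADD SENSITIVITY CLASSIFICATION TO dbo.Customers.CreditCardNumber
WITH (LABEL = 'Highly Confidential', INFORMATION_TYPE = 'Financial');

-- Review existing classifications for compliance reporting.
SELECT o.name AS table_name, c.name AS column_name, sc.label, sc.information_type
FROM sys.sensitivity_classifications AS sc
JOIN sys.objects AS o ON o.object_id = sc.major_id
JOIN sys.columns AS c ON c.object_id = sc.major_id AND c.column_id = sc.minor_id;
```
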
Mission-critical availability – High uptime

Always On Availability Groups have been enhanced to include automatic redirection of connections to the primary based on read/write intent.
High availability configurations for SQL Server running in containers can be enabled with Always On Availability Groups using Kubernetes.
Resumable online indexes now support create operations and include database scoped defaults.
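For instance, a resumable online index build can be started with a time cap, paused during peak hours, and resumed later without losing completed work; the index and table names below are hypothetical.

```sql
-- Start an online, resumable index build with a time cap (names are hypothetical).
CREATE INDEX IX_Orders_OrderDate
ON dbo.Orders (OrderDate)
WITH (ONLINE = ON, RESUMABLE = ON, MAX_DURATION = 60 MINUTES);

-- Pause during peak hours, then pick up where the build left off.
ALTER INDEX IX_Orders_OrderDate ON dbo.Orders PAUSE;
ALTER INDEX IX_Orders_OrderDate ON dbo.Orders RESUME;

-- Check progress of any resumable index operations.
SELECT name, percent_complete, state_desc
FROM sys.index_resumable_operations;
```
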
Developer experience

Enhancements to SQL Graph include MATCH support in the T-SQL MERGE statement and edge constraints.
New UTF-8 support gives customers the ability to reduce SQL Server’s storage footprint for character data; a short example follows this list.
The new Java language extension will allow you to call a pre-compiled Java program and securely execute Java code on the same server with SQL Server. This reduces the need to move data and improves application performance by bringing your workloads closer to your data.
Machine Learning Services has several enhancements including Windows Failover cluster support, partitioned models, and support for SQL Server on Linux.
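Picking up the UTF-8 item above, the hedged sketch below stores Unicode text in a varchar column by choosing one of the new UTF-8 enabled collations, which can shrink storage for mostly-ASCII data compared to nvarchar; the table is a made-up example.

```sql
-- Store Unicode text in varchar via a UTF-8 enabled collation (dbo.Products is hypothetical).
CREATE TABLE dbo.Products
(
    ProductID   INT IDENTITY PRIMARY KEY,
    ProductName VARCHAR(200) COLLATE Latin1_General_100_CI_AI_SC_UTF8 NOT NULL
);

INSERT INTO dbo.Products (ProductName)
VALUES ('Café au lait'), ('抹茶ラテ');  -- multi-byte characters fit in a varchar column
```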



Platform of choice
Additional capabilities for SQL Server on Linux include distributed transactions, replication, PolyBase, Machine Learning Services (illustrated in the sketch below), memory notifications, and OpenLDAP support.
Containers have new enhancements including use of the new Microsoft Container Registry with support for Red Hat Enterprise Linux images and Always On Availability Groups for Kubernetes.
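As a hedged illustration of Machine Learning Services, which now also runs with SQL Server on Linux, the same sp_execute_external_script pattern works on either operating system; the input query and table below are placeholders, not objects from this article.

```sql
-- Enable external scripts once per instance (requires Machine Learning Services).
EXEC sp_configure 'external scripts enabled', 1;
RECONFIGURE;

-- Run an in-database Python script over a T-SQL result set
-- (dbo.SalesHistory and its columns are hypothetical).
EXEC sp_execute_external_script
    @language     = N'Python',
    @script       = N'
OutputDataSet = InputDataSet.groupby("Region", as_index=False)["Amount"].sum()
',
    @input_data_1 = N'SELECT Region, Amount FROM dbo.SalesHistory'
WITH RESULT SETS ((Region NVARCHAR(50), TotalAmount FLOAT));
```
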
You can read more about what’s new in SQL Server 2019 in our documentation.

SQL Server 2019 support in Azure Data Studio

Expanded support for more data workloads in SQL Server requires expanded tooling. As Microsoft has worked with users of its data platform, we have seen the coming together of previously disparate personas: database administrators, data scientists, data developers, data analysts, and new roles still being defined. These users increasingly want to use the same tools to work together, seamlessly, across on-premises and cloud, using relational and unstructured data, working with OLTP, ETL, analytics, and streaming workloads.

Azure Data Studio offers a modern editor experience with lightning fast IntelliSense, code snippets, source control integration, and an integrated terminal. It is engineered with the data platform user in mind, with built-in charting of query result sets, an integrated notebook, and customizable dashboards. Azure Data Studio currently offers built-in support for SQL Server on-premises and Azure SQL Database, along with preview support for Azure SQL Managed Instance and Azure SQL Data Warehouse.



Azure Data Studio is today shipping a new SQL Server 2019 Preview Extension to add support for select SQL Server 2019 features. The extension offers connectivity and tooling for SQL Server big data clusters, including a preview of the first ever notebook experience in the SQL Server toolset, and a new PolyBase Create External Table wizard that makes accessing data from remote SQL Server and Oracle instances easy and fast.

More information:

https://cloudblogs.microsoft.com/windowsserver/2018/08/15/everything-you-need-to-know-about-windows-server-2019-part-1/

https://cloudblogs.microsoft.com/windowsserver/2018/08/20/everything-you-need-to-know-about-windows-server-2019-part-2/

https://cloudblogs.microsoft.com/windowsserver/2018/09/06/everything-you-need-to-know-about-windows-server-2019-part-3/

https://www.databasejournal.com/tips/sql-server-2019-is-here.html

https://docs.microsoft.com/en-us/sql/sql-server/what-s-new-in-sql-server-ver15?view=sqlallproducts-allversions

https://docs.microsoft.com/en-us/sql/sql-server/sql-server-ver15-release-notes?view=sqlallproducts-allversions

https://cloudblogs.microsoft.com/sqlserver/2018/09/25/introducing-microsoft-sql-server-2019-big-data-clusters/

https://cloudblogs.microsoft.com/sqlserver/2018/09/24/sql-server-2019-preview-combines-sql-server-and-apache-spark-to-create-a-unified-data-platform/

https://mountainss.wordpress.com/2018/09/27/microsoft-sql-server-2019-preview-overview-sql-sql2019-linux-containers-msignite/

https://medium.com/@jgperrin/microsoft-sql-server-2019-gets-a-spark-baf2cd26bdec

https://docs.microsoft.com/en-us/windows-server/remote/remote-desktop-services/desktop-hosting-reference-architecture

https://www.networkworld.com/article/3265052/data-center/top-6-features-in-windows-server-2019.html

https://www.anoopcnair.com/windows-server-2019-webinar/

https://www.vembu.com/blog/windows-server-2019-preview/

https://redmondmag.com/articles/2018/08/01/windows-server-2019-build-ledbat.aspx

https://rcpmag.com/articles/2011/02/01/the-2011-microsoft-product-roadmap.aspx

https://docs.microsoft.com/en-us/windows-server/networking/sdn/technologies/containers/container-networking-overview

https://blogs.windows.com/windowsexperience/2018/08/28/announcing-windows-server-2019-insider-preview-build-17744/

https://cloudblogs.microsoft.com/windowsserver/2018/09/24/windows-server-2019-announcing-general-availability-in-october/

https://www.gigxp.com/comparison-of-windows-server-2016-vs-2019-whats-the-difference/




Powering IT’s future while preserving the present: Introducing Red Hat Enterprise Linux 8






Red Hat Enterprise Linux multi-year roadmap




Red Hat Enterprise Linux 8 (RHEL 8) has not yet been generally released, but the beta became available on November 14, 2018, so you can get your hands dirty with the new version of the world’s leading enterprise operating system. The beta arrived shortly after IBM announced its $34 billion acquisition of Red Hat on October 28, 2018.  https://www.itzgeek.com/how-tos/linux/centos-how-tos/red-hat-enterprise-linux-8-release-date-and-new-features.html

Meet Red Hat Enterprise Linux 8

Linux containers, Kubernetes, artificial intelligence, blockchain and too many other technical breakthroughs to list all share a common component - Linux, the same workhorse that has driven mission-critical, production systems for nearly two decades. Today, we’re offering a vision of a Linux foundation to power the innovations that can extend and transform business IT well into the future: Meet Red Hat Enterprise Linux 8.

Microservices with Docker, Kubernetes, and Jenkins


Enterprise IT is evolving at a pace faster today than at any other point in history. This reality necessitates a common foundation that can span every footprint, from the datacenter to multiple public clouds, enabling organizations to meet every workload requirement and deliver any app, everywhere.

With Red Hat Enterprise Linux 8, we worked to deliver a shared foundation for both the emerging and current worlds of enterprise IT. The next generation of the world’s leading enterprise Linux platform helps fuel digital transformation strategies across the hybrid cloud, where organizations use innovations like Linux containers and Kubernetes to deliver differentiated products and services. At the same time, Red Hat Enterprise Linux 8 Beta enables IT teams to optimize and extract added value from existing technology investments, helping to bridge demands for innovation with stability and productivity.

Sidecars and a Microservices Mesh


In the four years since Red Hat Enterprise Linux 7 redefined the operating system, the IT world has changed dramatically and Red Hat Enterprise Linux has evolved with it. Red Hat Enterprise Linux 8 Beta once again sets a bar for how the operating system can enable IT innovation. While Red Hat Enterprise Linux 8 Beta features hundreds of improvements and dozens of new features, several key capabilities are designed to help the platform drive digital transformation and fuel hybrid cloud adoption without disrupting existing production systems.

Your journey into the serverless world



Red Hat Enterprise Linux 8 introduces the concept of Application Streams to deliver userspace packages more simply and with greater flexibility. Userspace components can now update more quickly than core operating system packages and without having to wait for the next major version of the operating system. Multiple versions of the same package, for example, an interpreted language or a database, can also be made available for installation via an application stream. This helps to deliver greater agility and user-customized versions of Red Hat Enterprise Linux without impacting the underlying stability of the platform or specific deployments.

Red Hat Enterprise Linux roadmap 2018


Beyond a refined core architecture, Red Hat Enterprise Linux 8 also enhances:

Networking

Red Hat Enterprise Linux 8 Beta supports more efficient Linux networking in containers through IPVLAN, connecting containers nested in virtual machines (VMs) to networking hosts with a minimal impact on throughput and latency. It also includes a new TCP/IP stack with Bottleneck Bandwidth and Round-trip propagation time (BBR) congestion control, which enables higher performance, minimized latency and decreased packet loss for Internet-connected services like streaming video or hosted storage.

Security

As with all versions of Red Hat Enterprise Linux before it, Red Hat Enterprise Linux 8 Beta brings hardened code and security fixes to enterprise users, along with the backing of Red Hat’s overall software security expertise. With Red Hat Enterprise Linux 8 Beta, our aim is to deliver a more secure-by-default operating system foundation across the hybrid cloud.

Serverless and Servicefull Applications - Where Microservices complements Serverless



OpenSSL 1.1.1 and TLS 1.3 are both supported in Red Hat Enterprise Linux 8, enabling server applications on the platform to use the latest standards for cryptographic protection of customer data. System-wide Cryptographic Policies are also included, making it easier to manage cryptographic compliance from a single prompt without the need to modify and tune specific applications.

Linux containers

Red Hat set a standard when we introduced enterprise support for Linux containers in Red Hat Enterprise Linux 7. Now, Linux containers have become a critical component of digital transformation, offering a roadmap for more portable and flexible enterprise applications, and Red Hat remains at the forefront of this shift with Red Hat Enterprise Linux 8.

Red Hat’s lightweight, open standards-based container toolkit is now fully supported and included with Red Hat Enterprise Linux 8. Built with enterprise IT security needs in mind, Buildah (container building), Podman (running containers) and Skopeo (sharing/finding containers) help developers find, run, build and share containerized applications more quickly and efficiently, thanks to the distributed and daemonless nature of the tools.

FaaS and Furious - 0 to Serverless in 60 Seconds, Anywhere - Alex Ellis, ADP


Systems management

The growth of Linux in corporate datacenters requires management and, frequently, new systems administrators are faced with managing complex system footprints or performing difficult tasks that are outside of their comfort zones. Red Hat Enterprise Linux 8 aims to make it easier on systems administrators of all experience levels with several quality of life improvements, starting with a single and consistent user control panel through the Red Hat Enterprise Linux Web Console. This provides a simplified interface to more easily manage Red Hat Enterprise Linux servers locally and remotely, including virtual machines.

Camel Riders in the Cloud


Red Hat Enterprise Linux roadmap


Composer makes it easier for both new and experienced Red Hat Enterprise Linux users to build and deploy custom images across the hybrid cloud - from physical and virtualized environments to private and public cloud instances. Using a straightforward graphical interface, Composer simplifies access to packages as well as the process for assembling deployable images. This means that users can more readily create Red Hat Enterprise Linux-based images, from minimal footprint to specifically optimized, for a variety of deployment models, including virtual machines and cloud environments.

Istio canaries and kubernetes



Yum 4, the next generation of the Yum package manager in Red Hat Enterprise Linux, delivers faster performance, fewer installed dependencies and more choices of package versions to meet specific workload requirements.


Lightning Talk: The State Of FaaS on Kubernetes - Michael Hausenblas, Red Hat


File systems and storage

New to Red Hat Enterprise Linux 8 Beta is Stratis, a volume-managing file system for more sophisticated data management. Stratis abstracts away the complexities inherent to data management via an API, enabling these capabilities without requiring systems administrators to understand the underlying nuances, delivering a faster and more efficient file system.

File System Snapshots provide a faster way of conducting file-level tasks, like cloning virtual machines, while saving space by consuming new storage only when data changes. Support for LUKSv2 to encrypt on-disk data is combined with Network-Bound Disk Encryption (NBDE) for more robust data security and simpler access to encrypted data.

IBM Acquires Red Hat: Creating the World's Leading Hybrid Cloud Provider (ibm red-hat-charts-10-2018)


Test the future

We don’t just want to tell you what makes Red Hat Enterprise Linux 8 Beta a foundation for the future of IT. We want you to experience it. Existing customers and subscribers are invited and encouraged to test Red Hat Enterprise Linux 8 Beta for themselves to see how they can deploy applications with more flexibility, more confidence and more control. Developers can also see the future of the world’s leading enterprise Linux platform through the Red Hat Developer Program. If you are new to Red Hat Enterprise Linux, please visit the Red Hat Enterprise Linux 8 Public Beta download site and view the README file for instructions on how to download and install the software.

https://access.redhat.com/products/red-hat-enterprise-linux/beta

Gartner predicts that, by 2020, more than 50% of global organizations will be running containerized applications in production, up from less than 20% today.* This means to us that developers need to be able to more quickly and easily create containerized applications. It’s this challenge that the Buildah project, with the release of version 1.0, aims to solve by bringing new innovation to the world of container development.


IBM + REDHAT "Creating the World's Leading Hybrid Cloud Provider..."


While Linux containers themselves present a path to digital transformation, the actual building of these containers isn’t quite so clear. Typically, building a Linux container image requires the use of an extensive set of tools and daemons (a container engine, so to speak). The existing tools are bulky by container standards, and I believe there has been a distinct lack of innovation. IT teams may want their build systems running the bare minimum of processes and tools; otherwise, additional complexity can be introduced that could lead to loss of system stability and even security risks. Complexity is a serious architectural and security challenge.



This is where Buildah comes in. A command line utility, Buildah provides only the basic requirements needed to create or modify Linux container images, making it easier to integrate into existing application build pipelines.

The resulting container images are not snowflakes, either; they are OCI-compliant and can even be built using Dockerfiles. Buildah is a distillation of container development to the bare necessities, designed to help IT teams to limit complexity on critical systems and streamline ownership and security workflows.

OpenShift Commons Briefing #122: State of FaaS on Kubernetes - Michael Hausenblas (Red Hat)



When we say “bare necessities,” we mean it. Buildah allows for the on-the-fly creation of containers from scratch—think of it as an empty box. For example, Buildah can assemble containers that omit components, such as package managers (DNF/YUM), that are not required in the final image. So not only can Buildah provide the capability to build these containers in a less complex and more secure fashion, it can cut bloat (and therefore image size) and tailor the image to exactly what you need in your cloud-native applications.

Since Buildah is daemonless, it is easier to run it in a container without setting up special infrastructure on the host or “leaking” host sockets into the container. You can run Buildah inside of your Kubernetes (or enterprise Kubernetes, like Red Hat OpenShift) cluster.

On-premises FaaS on Kubernetes


What’s special about Buildah 1.0

We’ve talked about Buildah before, most notably launching full, product-level support for it in Red Hat Enterprise Linux 7.5. Now that 1.0 has hit the community, here are a few of the notable features in Buildah that make it interesting:

Buildah has added external read/write volumes during builds, which enables users to build container images that reference external volumes while being built, but without having to ship those external volumes in the completed image. This helps to simplify image creation without bloating those images with unnecessary and unwanted artifacts in production.

To enhance security, Buildah can help the resulting images better comply with Federal Information Processing Standards (FIPS), computer systems standards required by the U.S. Federal Government for non-military, governmental operations, with support for FIPS mode. When a host is running in FIPS mode, Buildah can build and run containers in FIPS mode as well, making it easier for containers on hosts running in FIPS mode to comply with the standards.


Buildah now also offers multi-stage builds, multiple container transport methods for pulling and pushing images, and more. By focusing solely on building and manipulating container images, Buildah is a useful tool for anyone working with Linux containers. Whether you’re a developer testing images locally or looking for an independent image builder for a production toolchain, Buildah is a worthy addition to your container toolbelt.


Want to start building with Buildah yourself?

Try `yum -y install buildah` or learn more and contribute at the project site: https://github.com/projectatomic/buildah.

You can also see a more detailed example at https://www.projectatomic.io/blog/2018/03/building-buildah-container-image-for-kubernetes/.

*Smarter with Gartner, 6 Best Practices for Creating a Container Platform Strategy, October 31, 2017, https://www.gartner.com/smarterwithgartner/6-best-practices-for-creating-a-container-platform-strategy/

6 Best Practices for Creating a Container Platform Strategy

Gartner has identified six key elements that should be part of a container platform strategy to help I&O leaders mitigate the challenges of deploying containers in production environments:


  1. Security and governance - Security is a particularly challenging issue for production container deployments. The integrity of the shared host OS kernel is critical to the integrity and isolation of the containers that run on top of it. A hardened, patched, minimalist OS should be used as the host OS, and containers should be monitored on an ongoing basis for vulnerabilities and malware to ensure a trusted service delivery.
  2. Monitoring - The deployment of cloud-native applications shifts the focus to container-specific and service-oriented monitoring (from host-based) to ensure compliance with resiliency and performance service-level agreements. “It’s therefore important to deploy packaged tools that can provide container and service-level monitoring, as well as linking container monitoring tools to the container orchestrators to pull in metrics on other components for better visualization and analytics,” says Chandrasekaran.
  3. Storage - Since containers are transient, the data should be disassociated from the container so that the data persists and is protected even after the container is spun down. Scale-out software-defined storage products can solve the problem of data mobility, the need for agility and simultaneous access to data from multiple application containers.
  4. Networking - The portability and short-lived life cycle of containers will overwhelm the traditional networking stack. The native container networking stack doesn’t have robust-enough access and policy management capabilities. “I&O teams must therefore eliminate manual network provisioning within containerized environments, enable agility through network automation and provide developers with proper tools and sufficient flexibility,” Chandrasekaran says.
  5. Container life cycle management - Containers present the potential for sprawl even more severe than many virtual machine deployments caused. This complexity is often intensified by many layers of services and tooling. Container life cycle management can be automated through a close tie-in with continuous integration/continuous delivery processes together with continuous configuration automation tools to automate infrastructure deployment and operational tasks.
  6. Container orchestration - Container management tools are the “brains” of a distributed system, making decisions on discovery of infrastructure components making up a service, balancing workloads with infrastructure resources, and provisioning and deprovisioning infrastructures, among other things. “The key decision here is whether hybrid orchestration for container workloads is required or if it is sufficient to provision based on use case and manage multiple infrastructure silos individually,” Chandrasekaran says.

Jaeger Project Intro - Juraci Kröhling, Red Hat (Any Skill Level)



More Information:

https://www.itzgeek.com/how-tos/linux/centos-how-tos/red-hat-enterprise-linux-8-release-date-and-new-features.html

https://developers.redhat.com/blog/2018/11/15/red-hat-enterprise-linux-8-beta-is-here/

https://access.redhat.com/discussions/3534521

https://www.redhat.com/en/blog/powering-its-future-while-preserving-present-introducing-red-hat-enterprise-linux-8-beta#

https://www.redhat.com/en/blog/daemon-haunted-container-world-no-longer-introducing-buildah-10

https://developers.redhat.com/articles/podman-next-generation-linux-container-tools/

https://blog.openshift.com/promoting-container-images-between-registries-with-skopeo/

https://access.redhat.com/products/red-hat-enterprise-linux/

https://www.alibabacloud.com/partners/redhat








