すずけんメモ

Technical notes.

Sessions I attended at AWS re:Invent 2013, plus the ones I wanted to attend

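Notes on the sessions I attended at re:Invent, in chronological order. I'm also listing the sessions I wanted to attend but couldn't, so that I can watch the videos later.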

  • 2013/11/25 13:17 Updated only the sessions for which I found slides and videos

2013/11/12

  • AWS re:Invent Gameday

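The theme was a simple system that queues image-processing jobs in SQS and deploys the results to S3. In the game, each team is given power user permissions on the opposing team's AWS account and launches nasty attacks against it. It was fun, but my impression is that the experience was somewhat uneven from team to team. For the record, these are the attacks I tried: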

  • Quietly rewrote the Auto Scaling group
  • Changed the S3 bucket's permissions and expiration settings
  • Changed the security group settings

I think the nastiest attack came from a team that rewrote the EC2 kernel ID. If I manage to write proper notes (that is, if I can remember), I'll report on it.

2013/11/13

Day 1 Keynote

ARC310 - Orchestration and Deployment Options for Hybrid Enterprise Environments

"Configure once, deploy anywhere" is one of the most sought-after enterprise operations requirements. Large-scale IT shops want to keep the flexibility of using on-premises and cloud environments simultaneously while maintaining the monolithic custom, complex deployment workflows and operations. This session brings together several hybrid enterprise requirements and compares orchestration and deployment models in depth without a vendor pitch or a bias. This session outlines several key factors to consider from the point of view of a large-scale real IT shop executive. Since each IT shop is unique, this session compares strengths, weaknesses, opportunities, and the risks of each model and then helps participants create new hybrid orchestration and deployment options for the hybrid enterprise environments.

Donn Morrill - Manager, Solutions Architecture, Amazon Web Services

A talk about deploying smoothly in the enterprise while migrating between on-premises and AWS. It was a bit different from what I had expected.

SVC202 - Asgard, Aminator, Simian Army and more: How Netflix’s Proven Tools Can Help Accelerate Your Start-up

You're on the verge of a new startup and you need to build a world-class, high-scale web application on AWS so it can handle millions of users. How do you build it quickly without having to reinvent and re-implement the best-practices of large successful Internet companies? NetflixOSS is your answer. In this session, we’ll cover how an emerging startup can leverage the different open source tools that Netflix has developed and uses every day in production, ranging from baking and deploying applications (Asgard, Aminator), to hardening resiliency to failures (Hystrix, Simian Army, Zuul), making them highly distributed and load balanced (Eureka, Ribbon, Archaius) and managing your AWS resources efficiently and effectively (Edda, Ice). You’ll learn how to get started using these tools, learn best practices from engineers who actually created them, so, like Netflix, you can too unleash the power of AWS and scale your application processes as you grow.

Adrian Cockcroft - Director of Architecture, Cloud Systems, Netflix Ruslan Meshenberg - Director, Platform Engineering, Netflix

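A sweeping overview of Netflix OSS. Bottom line: just look at http://netflix.github.io . Still, it was good to hear a high-level introduction to each of the OSS projects. Also, it was extremely crowded.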

BDT401 - Using AWS to Build a Scalable Big Machine Data Management and Processing Service

By turning the data center into an API, AWS has enabled Sumo Logic to build a very large scale IT operational analytics platform as a service at unprecedented scale and velocity. Based around Amazon EC2 and Amazon S3, the Sumo Logic system is ingesting many terabytes of unstructured log data a day while at the same time delivering real-time dashboards and supporting hundreds of thousands of queries against the collected data. When co-founder and CTO Christian Beedgen started Sumo Logic, it was obvious that the service would have to scale quickly and elastically, and AWS has been providing the perfect infrastructure for this endeavor from the start.

In this talk, Christian dives into the core Sumo Logic architecture and explains which AWS services are making Sumo Logic possible. Based around an in-house developed automation and continuous deployment system, Sumo Logic is leveraging Amazon S3 in particular for large-scale data management and Amazon DynamoDB for cluster configuration management. By relying on automation, Sumo Logic is also able to perform sophisticated staging of new code for rapid deployment. Using the log-based instrumentation of the Sumo Logic codebase, Christian will dive into the performance characteristics achieved by the system today and share war stories about lessons learned along the way.

Christian Beedgen - CTO & Co-Founder, Sumo Logic

A talk about Sumo Logic's analytics infrastructure. I learned a lot from this one. I want to write up my notes later.

BDT306 - Data Science at Netflix with Amazon EMR

A few years ago, Netflix had a fairly "classic" business intelligence tech stack. Things have definitely changed. Netflix is a heavy user of AWS for much of its ongoing operations, and Data Science & Engineering (DSE) is no exception. In this talk, we dive into the Netflix DSE architecture: what and why. Key topics include their use of Big Data technologies (Cassandra, Hadoop, Pig + Python, and Hive); their Amazon S3 central data hub; their multiple persistent Amazon EMR clusters; how they benefit from AWS elasticity; their data science-as-a-service approach, how they made a hybrid AWS/data center setup work well, their open-source Hadoop-related software, and more.

Kurt Brown - Director, Data Platform, Netflix

A case study of EMR usage at Netflix, presented by Kurt from Netflix. This one was very dense. Lipstick looks handy.

2013/11/14

Day 2 Keynote

SPOT401 - Leading the NoSQL Revolution: Under the Covers of Distributed Systems at Scale

The Dynamo paper started a revolution in distributed systems. The contributions from this paper are still impacting the design and practices of some of the world's largest distributed systems, including those at Amazon.com and beyond. Building distributed systems is hard, but our goal in this session is to simplify the complexity of this topic to empower the hacker in you! Have you been bitten by the eventual consistency bug lately? We show you how to tame eventual consistency and make it a great scaling asset. As you scale up, you must be ready to deal with node, rack, and data center failure. We share insights on how to limit the blast radius of the individual components of your system, battle tested techniques for simulating failures (network partitions, data center failure), and how we used core distributed systems fundamentals to build highly scalable, performant, durable, and resilient systems. Come watch us uncover the secret sauce behind Amazon DynamoDB, Amazon SQS, Amazon SNS, and the fundamental tenets that define them as Internet scale services. To turn this session into a hacker's dream, we go over design and implementation practices you can follow to build an application with virtually limitless scalability on AWS within an hour. We even share insights and secret tips on how to make the most out of one of the services released during the morning keynote.

Swami Sivasubramanian - General Manager, Amazon Web Services Khawaja Shams - Technical Advisor, Amazon Web Services

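A session where people from the AWS DynamoDB team talked about what goes on behind the scenes of development at AWS. The stories about Amazon's culture, which the keynote also touched on, were interesting as expected.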

DAT204 - SmugMug: From MySQL to Amazon DynamoDB

SmugMug.com is a popular hosting and commerce platform for photo enthusiasts with hundreds of thousands of subscribers and millions of viewers. Learn how SmugMug uses Amazon DynamoDB to provide customers detailed information about millions of daily image and video views. SmugMug shares code and information about their stats stack, which includes an HTTP interface to Amazon DynamoDB and also interfaces with their internal PHP stack and other tools such as Memcached. Get a detailed picture of lessons learned and the methods SmugMug uses to create a system that is easy to use, reliable, and high performing.

Brad Clawsie - Software Engineer, SmugMug

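A talk from SmugMug about migrating from MySQL to DynamoDB. Right at the start they said that MySQL had been used like a key-value store from the beginning, so moving to DynamoDB was not that hard, which took the wind out of my sails. The later part about building a proxy for DynamoDB was interesting, though.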

DAT306 - How Amazon.com, with One of the World’s Largest Data Warehouses, Is Leveraging Amazon Redshift

Learn how Amazon’s enterprise data warehouse, one of the world's largest data warehouses managing petabytes of data, is leveraging Amazon Redshift. Learn about Amazon's enterprise data warehouse best practices and solutions, and how they’re using Amazon Redshift technology to handle design and scale challenges.

Erik Selberg - Director, Amazon Enterprise Data Warehouse, Amazon.com Abhishek Agrawal - Development Manager - DW Redshift Integration, Amazon Web Services Adam Duncan - Technical Program Manager, Enterprise Data Warehouse, Amazon.com

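The story of Amazon.com adopting Redshift. Probably a must-see for any team using Redshift. They said that Amazon.com does not give AWS services preferential treatment and uses them only "if that choice is the best one for us." Very Amazon. Apparently about half of the DWH has now been replaced with Redshift. I want to make time to write up my notes.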

BDT301 - Scaling your Analytics with Amazon Elastic MapReduce

Big data technologies let you work with any velocity, volume, or variety of data in a highly productive environment. Join the General Manager of Amazon EMR, Peter Sirota, to learn how to scale your analytics, use Hadoop with Amazon EMR, write queries with Hive, develop real world data flows with Pig, and understand the operational needs of a production data platform.

Peter Sirota - Sr Manager, Software Development, Amazon Web Services Bob Harris - CTO, Channel 4 Television Eva Tse - Director of Big Data Platform, Netflix

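A talk about scaling EMR. The part by Eva from Netflix was, as expected, the interesting one. Netflix OSS is unstoppable. Also, someone from AWS said that Impala support is coming soon, but the timing is unclear.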

MBL307 - How Parse Built a Mobile Backend as a Service on AWS

Parse is a BaaS for mobile developers that is built entirely on AWS. With over 150,000 mobile apps hosted on Parse, the stability of the platform is our primary concern, but it coexists with rapid growth and a demanding release schedule. This session is a technical discussion of the current architecture and the design decisions that went in to scaling the platform rapidly and robustly over the past year and a half. We talk about some of the lessons learned managing and scaling MongoDB, Cassandra, Redis, and MySQL in the cloud. We also discuss how Parse went from launching individual instances using chef to managing clusters of hosts with Auto Scaling groups, with instance discovery and registry handled by ZooKeeper, thus enabling us to manage vastly larger sets of services with fewer human resources. This session is useful to anyone who is trying to scale up from startup to established platform without sacrificing agility.

Charity Majors - Production Engineering Manager, Parse

I've already written this one up. It was a great session. Charity looked like she was having fun presenting.

The Parse DevOps talk at re:Invent was so good that I wrote it up - すずけんメモ

re:Play Party

An astonishingly large drinking party sponsored by Intel, or rather, a nightclub.

2013/11/15

STG306 - Dropbox presents Cloud Storage for App Developers

It’s more important than ever to create apps that provide an amazing user experience across multiple platforms, devices, and even offline. Even though saving data in cloud storage is becoming ubiquitous, it's still difficult for developers to manage user authentication, syncing, and caching. In this session, you'll learn how Dropbox addresses these challenges, how they leverage the AWS platform, and what tools they provide to developers on the Dropbox platform.

Steve Marx - Developer Advocate, Dropbox

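I thought someone from Dropbox would talk passionately about storage first thing in the morning, but the session stuck to explaining the Dropbox API, which was not what I had expected.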

BDT311 - New Launch: Supercharge Your Big Data Infrastructure with Amazon Kinesis: Learn to Build Real-time Streaming Big data Processing Applications

This presentation provides an overview of the technical architecture of Kinesis, the new AWS service for real-time streaming big data ingestion and processing. This is done as part of describing how to implement a sample application that processes a Kinesis stream. The talk also describes how data ingested through Kinesis can be easily filtered, transformed, and uploaded into a variety of AWS storage services, such as S3 and Redshift.

John Dunagan - Principal Algorithm Engineer, Amazon Web Services Ryan Waite - General Manager, Data Services, Amazon Web Services Marvin Theimer - Vice President, Distinguished Engineer, Amazon Web Services

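A session on Amazon Kinesis, which is personally the service I'm most interested in among this year's data-related announcements. It was a good explanation of distributed stream processing. I plan to write this one up later. Listening to the Q&A, I got the impression that a release in Japan might take about three months, but I'm not sure.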

ARC303 - Unmeltable Infrastructure at Scale: Using Apache Kafka, Twitter Storm, and Elastic Search on AWS

This is a technical architect's case study of how Loggly has employed the latest social-media-scale technologies as the backbone ingestion processing for our multi-tenant, geo-distributed, and real-time log management system. This presentation describes design details of how we built a second-generation system fully leveraging AWS services including Amazon Route 53 DNS with heartbeat and latency-based routing, multi-region VPCs, Elastic Load Balancing, Amazon Relational Database Service, and a number of pro-active and re-active approaches to scaling computational and indexing capacity.

The talk includes lessons learned in our first generation release, validated by thousands of customers; speed bumps and the mistakes we made along the way; various data models and architectures previously considered; and success at scale: speeds, feeds, and an unmeltable log processing engine.

Philip O'Toole - Senior Architect and Lead Developer, Loggly Jim Nisbet - CTO and VP of Engineering, Loggly

I've already written this one up too. I think it was a great session, both for the architecture and as a real-world case study.

The Loggly session at re:Invent on their distributed stream processing environment was interesting, so I wrote it up - すずけんメモ

Sessions I wanted to attend but couldn't

I'll watch the videos later, or wait for someone to write them up on a blog. Or I'll ask someone at AWS.

2013/11/13

I'm thinking of watching the CI talk on video.

2013/11/14

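I'm curious about the architecture of the graph-based recommendation system. The CLI session also seems likely to have tips I can use day to day. I expect the Netflix talk to be pretty hardcore.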

2013/11/15

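Judging from the #reinvent hashtag on my timeline, the Yelp session seemed to be a great one. The benchmark talk also seemed to generate a lot of buzz. I especially wanted to hear the DAT304 talk on Dynamo. And purely out of curiosity, I'd like to hear the MMO talk, which sounds like by far the edgiest of the bunch.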