Data-driven approaches to improve dependability of cloud services.
Record type:
Bibliographic - electronic resource : Monograph/item
Title proper/author:
Data-driven approaches to improve dependability of cloud services./
Author:
Potharaju, Rahul.
Extent:
158 p.
Notes:
Source: Dissertation Abstracts International, Volume: 76-02(E), Section: B.
Contained By:
Dissertation Abstracts International 76-02B(E).
Subject:
Computer Science.
Electronic resource:
http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=3636490
ISBN:
9781321181326
Thesis (Ph.D.)--Purdue University, 2014.
This item must not be sold to any third party vendors.
The growing demand for always-on and low-latency cloud services is driving the creation of globally distributed datacenters. A major factor affecting service availability is the reliability of the network, both inside the datacenters and on the wide-area links connecting them. While several research efforts focus on building scale-out datacenter networks, little has been reported on real network failures and how they impact geo-distributed services. Towards improving the dependability of the underlying datacenter networks, in this dissertation, we make one of the first attempts to characterize intra-datacenter and inter-datacenter network failures from a service perspective. Specifically, we make the following contributions:
1. Analysis Methodology for Structured Data: Our dataset includes multiple sources of structured network telemetry data spanning three years, logged in monitoring servers of a large cloud provider comprising 100k+ servers, 10k+ core network devices, 2k+ middleboxes and 100k+ network links across 10+ datacenters. This dataset covers a wide range of network data sources, including syslog and SNMP alerts, and traffic carried by links. To this end, we describe a systematic methodology for analyzing this structured data based on event processing to extract events having service-level impact.
2. Analysis Methodology for Unstructured Data: Our dataset also includes an important piece of operational knowledge: network trouble tickets, which are diaries written by network operators to keep track of their troubleshooting efforts while fixing a problem. To this end, we take a practical step towards automatically analyzing natural language text in network trouble tickets to infer the problem symptoms, troubleshooting activities and resolution actions. Our system, NetSieve, combines statistical natural language processing (NLP), knowledge representation, and ontology modeling to achieve these goals.
3. Data-Driven Approaches to Deriving Actionable Insights: Our overarching goal in this dissertation is to enable operators to understand global problem trends instead of making decisions based on isolated incidents. We outline several analyses rooted in reliability analysis and applied statistics for characterizing network failures and deriving actionable insights from them. Our study reveals several important findings on (a) the failure characteristics of network elements, (b) the availability of network domains, (c) service impact, (d) causes of network failures, (e) effectiveness of repairs, and (f) modeling failures.
ISBN: 9781321181326
Subjects--Topical Terms:
626642
Computer Science.
LDR :04889nmm a2200325 4500
001 2056296
005 20150505071908.5
008 170521s2014 ||||||||||||||||| ||eng d
020 $a 9781321181326
035 $a (MiAaPQ)AAI3636490
035 $a AAI3636490
040 $a MiAaPQ $c MiAaPQ
100 1 $a Potharaju, Rahul. $3 3170043
245 1 0 $a Data-driven approaches to improve dependability of cloud services.
300 $a 158 p.
500 $a Source: Dissertation Abstracts International, Volume: 76-02(E), Section: B.
500 $a Adviser: Cristina Nita-Rotaru.
502 $a Thesis (Ph.D.)--Purdue University, 2014.
506 $a This item must not be sold to any third party vendors.
520 $a As part of this dissertation, we have built a broad range of systems including real-time network dashboards, a big data analytics system for analyzing network telemetry data, and an inference tool for root cause analysis in network troubleshooting. Several components of the dissertation work either have undergone a tech-transfer or are being used by multiple business groups inside Microsoft. NetWiser, a Microsoft Research project entailing this dissertation, was awarded the Microsoft Trustworthy Computing Reliability Award for 2013.
520 $a The problem inference system part of this dissertation, NetSieve, is currently being used across different teams within Microsoft to improve network management: the Network Architecture team for comparing device reliability across platforms and vendors, the Capacity Planning team for understanding why network redundancy is ineffective in masking failures, and the Incident Management and Operations team for finding the top-k problems and failing components while troubleshooting devices and determining whether past repairs were effective. Since its inception, NetSieve has also been used to automate root cause analysis of security incidents within Microsoft's datacenters and recently found its way into commercial use through Microsoft's System Center Advisor (http://www.systemcenteradvisor.com).
590 $a School code: 0183.
650 4 $a Computer Science. $3 626642
650 4 $a Information Science. $3 1017528
650 4 $a Information Technology. $3 1030799
690 $a 0984
690 $a 0723
690 $a 0489
710 2 $a Purdue University. $b Computer Sciences. $3 1019069
773 0 $t Dissertation Abstracts International $g 76-02B(E).
790 $a 0183
791 $a Ph.D.
792 $a 2014
793 $a English
856 4 0 $u http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=3636490
Holdings:
Barcode:
W9288775
Location:
Electronic resource
Circulation category:
11.線上閱覽_V (online reading)
Material type:
E-book
Call number:
EB
Use type:
Normal
Loan status:
On shelf
Holds:
0