
Video: Transforming B2B Sales with Spark-Powered Sales Intelligence (Songtao Guo, LinkedIn)


Transcription

Moderator: Alright, let's get started. I'm very pleased to have Songtao and Wei from LinkedIn, who will present their work on leveraging Spark to drive better B2B sales strategy.

Songtao Guo: Thanks for the introduction, and welcome to this enterprise session. My name is Songtao Guo, and this is my colleague Wei. We are both from the data mining team of LinkedIn Analytics. As a horizontal team, we empower business decisions through data and science. With the growing adoption of Spark and grid infrastructure support at LinkedIn, Spark has quickly become a critical core technology behind many of our data solutions.

In today's talk we're going to introduce a representative Spark application: an intelligence engine built by our team to power LinkedIn's B2B sales. As the largest professional social network, LinkedIn transforms companies by changing the way they hire, market, sell, and work, and the majority of our revenue actually comes from our enterprise customers, so you can imagine the huge potential impact of our B2B intelligence engine.

In the first part of the presentation I'm going to give a quick overview of the problems we're solving, their challenges, and the overall architecture of the B2B intelligence engine. Next, my partner Wei will zoom in on several key components, with a focus on feature management and the model building platform. Last but not least, we'll walk you through some of the interesting applications and share the lessons we learned along the way.
So first, let me introduce some of the common problems we're solving in the B2B analytics world. From a business point of view, there are two main tasks in B2B sales: the first is to acquire new customers, and the second is to empower existing customers. If you look at the entire sales lifecycle, different stages actually have a different focus. At the early stage we care about identifying opportunities and prioritizing marketing effort. Once a prospect gets into the sales funnel, our sales reps care about how likely they can convert this deal, and they also want to estimate the potential business impact of the opportunity, for example how much money the customer is going to spend in the coming month or the coming year.

Once they successfully convert the deal, a new journey is just beginning. Our relationship managers keep nurturing the client, try to understand the potential risk of attrition and what kinds of actions to take to prevent it, and also care about whether it is possible to sell more to this client. To address these problems efficiently, we need to build predictive models for the different problems. And those are not just the problems we're solving at LinkedIn; many different verticals actually have similar questions, and if you break it down by product it could be even more problems to solve. We're talking about thirty, forty, or even hundreds of models that need to be built.
Beyond the increasing demand, we are also aware of the challenges behind those problems. Like in many real-world problems, the data won't be perfect, and dealing with all kinds of data quality issues is always a challenge. The lack of a centralized repository and a source of truth makes it challenging to standardize feature representation. We also observed that many solutions just deliver a single predicted score without any actionable insights, which are very critical in the B2B world. Those can be considered the vertical challenges.

In terms of horizontal challenges: we have solutions from different teams; some are redundant, some are built on top of unreliable data sources, and some no longer perform well, so unifying those solutions is a big challenge. There is also internal scalability: solving one particular problem could easily cost one full-time employee about a month or even a quarter to deliver a production-ready solution. So scaling the model building effort and making modelers more productive is another challenge.
To address those challenges, our mission is to build a one-stop solution that can deliver best-in-class intelligence for all our enterprise go-to-market products. In the rest of the presentation I will show you how we transformed an old B2B model building landscape into a new paradigm using Spark and many other open source technologies.

At a high level, our system consists of three layers. The data layer provides the data foundation of the whole system; I'll talk more about it later. The intelligence layer provides the data science solutions, supporting various supervised and unsupervised learning tasks. On top is our application layer, where we deploy our models and integrate the results into our client-facing tools. Here we also highlight the tools and technologies we used to build the system. As you can see, we benefited a lot from the open source community and the Hadoop ecosystem: we use Pig and Hive for data ETL, Spark MLlib and XGBoost for modeling, Azkaban for scheduling and workflow management, and Dr. Elephant for performance monitoring and job tuning, just to name a few.
Let me briefly review each of the layers, starting from the data layer. It gives us a unified view of all available features, their quality, and the labels used for training, validation, and continuous evaluation. In this layer we have a feature mart that contains heterogeneous entities and all kinds of time series with various granularities and sufficient length. Given any training data with timestamps, we can easily find the corresponding historical context for an entity and derive a proper representation from that context (a minimal sketch of this point-in-time lookup follows below). We also define a standardized feature onboarding process, so that everybody can contribute new features to this feature mart and share them across the company, and we enable feature search and feature profiling for better discoverability and quality control.
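To make that point-in-time lookup concrete, here is a minimal PySpark sketch. The feature mart layout, paths, and column names (entity_id, snapshot_date, label_date) are illustrative assumptions, not LinkedIn's actual schema:

    # Minimal sketch of point-in-time feature lookup, assuming a hypothetical
    # feature mart keyed by (entity_id, snapshot_date) and training labels
    # keyed by (entity_id, label_date); paths and column names are illustrative.
    from pyspark.sql import SparkSession, functions as F, Window

    spark = SparkSession.builder.appName("feature-lookup").getOrCreate()
    feature_mart = spark.read.parquet("/data/feature_mart")   # hypothetical path
    labels = spark.read.parquet("/data/training_labels")      # hypothetical path

    # Keep only snapshots at or before the label date, then take the most
    # recent one, so features never leak information from after the label.
    joined = (labels.join(feature_mart, "entity_id")
                    .where(F.col("snapshot_date") <= F.col("label_date")))
    w = (Window.partitionBy("entity_id", "label_date")
               .orderBy(F.col("snapshot_date").desc()))
    training_data = (joined.withColumn("rn", F.row_number().over(w))
                           .where(F.col("rn") == 1)
                           .drop("rn"))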
Besides feature management, the other thing we want to highlight here is our label management. How you define the label pretty much defines the problem you're solving, and especially if you're in a big organization that requires building tens or hundreds of models, managing your label generation logic and the golden datasets becomes very important. Like in a Kaggle competition, by sharing that asset within your company you can engage many participants with the same interest in improving the baseline solution. Besides that, a mature label generation pipeline can also support continuous evaluation, helping you keep monitoring your model performance and error rate over time (sketched below). So that is the data layer.
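The talk doesn't show code for this continuous evaluation; here is a minimal sketch of what tracking a metric month over month could look like, with a hypothetical table layout (score, label, month columns):

    # Minimal sketch of continuous evaluation: given freshly labeled data with
    # a month column, track AUC month over month. Table layout and column
    # names are assumptions for illustration.
    from pyspark.sql import SparkSession, functions as F
    from pyspark.ml.evaluation import BinaryClassificationEvaluator

    spark = SparkSession.builder.appName("continuous-eval").getOrCreate()
    scored = spark.read.parquet("/data/scored_with_labels")  # hypothetical path

    evaluator = BinaryClassificationEvaluator(rawPredictionCol="score",
                                              labelCol="label",
                                              metricName="areaUnderROC")
    months = sorted(r["month"] for r in scored.select("month").distinct().collect())
    for month in months:
        auc = evaluator.evaluate(scored.where(F.col("month") == month))
        print(month, auc)  # in practice this would feed a dashboard or alert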
The intelligence layer serves as the brain of the system. What we deliver to the modeler is a highly configurable ML workflow: users specify the location of the training and testing data together with many other configurations, and the system automatically ingests the desired features, does another round of sophisticated feature engineering, and trains multiple specified models in parallel. We benefit a lot from Spark's rich machine learning library and its easy-to-use DataFrame-based APIs; Spark's ML pipelines and parameter grid builder significantly improve the efficiency of the model building effort, and we're also very excited by the new release of Deep Learning Pipelines. With all the building blocks in place, we are ready to tackle a wide spectrum of B2B applications. Some can be formulated as a regression problem, for example estimating the deal size, price, or future spending of a company; some can be formulated as a classification problem, for example predicting the probability of winning a deal; and some can be solved by hybrid solutions.
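Those Spark ML building blocks look roughly like the following sketch; the training table, feature columns, and choice of logistic regression are illustrative assumptions, not the talk's actual configuration:

    # Minimal sketch of a Spark ML pipeline with grid search, as mentioned
    # above; column names and the training table are assumptions.
    from pyspark.sql import SparkSession
    from pyspark.ml import Pipeline
    from pyspark.ml.feature import VectorAssembler
    from pyspark.ml.classification import LogisticRegression
    from pyspark.ml.tuning import ParamGridBuilder, CrossValidator
    from pyspark.ml.evaluation import BinaryClassificationEvaluator

    spark = SparkSession.builder.appName("b2b-model").getOrCreate()
    train = spark.read.parquet("/data/train")  # hypothetical path

    assembler = VectorAssembler(inputCols=["f1", "f2", "f3"], outputCol="features")
    lr = LogisticRegression(featuresCol="features", labelCol="label")
    pipeline = Pipeline(stages=[assembler, lr])

    grid = (ParamGridBuilder()
            .addGrid(lr.regParam, [0.01, 0.1])
            .addGrid(lr.elasticNetParam, [0.0, 0.5])
            .build())

    cv = CrossValidator(estimator=pipeline, estimatorParamMaps=grid,
                        evaluator=BinaryClassificationEvaluator(labelCol="label"),
                        numFolds=3)
    model = cv.fit(train)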
Next, let me hand over to Wei to talk more about our in-house B2B framework and some of the interesting use cases.
Wei: Thanks, Songtao, for the great introduction. I'm going to take you a little bit deeper into the platform we built for these B2B intelligence applications. I'll cover three core components: how we derive insightful features from the LinkedIn economic graph, the creation of the centralized data mart, and the brains Songtao just talked about, the intelligence engine, including learning, reasoning, and model management. At the end I will cover two case studies and some lessons we learned along the way.

Everything starts from the data; data is fundamental. Member-wise data is something we pay a lot of attention to: it is abundant and has a lot of rich meta-information associated with it. Company-wise data, by contrast, is extremely limited; sometimes you have to resort to third parties to collect some of it, and sometimes you even have to crawl the internet.
Fortunately, LinkedIn is a very unique place that has a very large economic graph. On one side we have more than 10 million companies' information and more than 500 million members' information, and that's not all: we also have the relationships between members and members, members and companies, and companies and companies. Through those connections, we can actually derive a lot of valuable and exciting signals from the graph. For example, the increase or decrease of employees within a company actually reflects the growth of the company; on the other hand, the social activities of recruiters or talent professionals can reflect their usage of our LinkedIn products. We have brought all those valuable derived signals into our data mart.

Our data mart covers a large range of data, including derived company-level knowledge as well as generic member- and company-level information from external and internal data sources, at different granularities, either monthly or daily. We also have different member segmentations: for example, members are segmented by their locations, by their jobs, by their functions, so we can aggregate that information to derive new, valuable signals for B2B applications. All the data has very rich meta-information associated with it and can be traced back easily.

To ensure we really do have high-quality data, we built a feature management component that relies heavily on Spark and its statistical APIs. We basically run feature profiling as soon as the data is generated, and we also have a visualization system that shows the results, so we have good monitoring of the quality of the data.
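A minimal sketch of what such profiling can look like with Spark's built-in DataFrame statistics; the snapshot path is an assumption:

    # Minimal sketch of feature profiling with Spark's built-in statistics,
    # run as soon as a new snapshot lands; the path is an assumption.
    from pyspark.sql import SparkSession, functions as F

    spark = SparkSession.builder.appName("feature-profiling").getOrCreate()
    snapshot = spark.read.parquet("/data/feature_mart/latest")  # hypothetical

    # count / mean / stddev / min / max per numeric column
    snapshot.describe().show()

    # null rate per column, a common data-quality signal to visualize
    null_rates = snapshot.select(
        [F.avg(F.col(c).isNull().cast("int")).alias(c) for c in snapshot.columns])
    null_rates.show()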
On the other side, as Songtao mentioned, we have a label management system. Because we are dealing with different B2B applications across different verticals, we try to unify the label generation logic for the same type of problem across verticals; but some verticals do have very specific problems, so we also have unique label generation pipelines catering to different business use cases.
So let's take a brief look at the intelligence engine. We start from the data, and after we feed the data into the engine, the engine does a few things; this part is heavily built on Spark, as well as MLlib and other open source libraries. First it does feature engineering: feature integration, selection, and pruning. We have an in-house feature selection algorithm as well as a pruning stage, because when we integrate all the features there are usually thousands of them, and we don't want to put unnecessary or redundant information into the actual learning stage (see the sketch after this paragraph). After that, as is typical, we do train/validation dataset generation for cross-validation and tuning purposes, and once the data is generated we feed it into the model building block. We also have model interpretation, which I will talk about later. After deployment to production, we have a model management block which is responsible for monitoring both the features and the models; we always keep an eye on model performance, and when we notice model degradation we do a model refresh, to make sure every model is kept up to date.
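The in-house selection algorithm isn't public, so the following is only a sketch of one common pruning idea, dropping one of each highly correlated pair; column names are assumptions:

    # Minimal sketch of correlation-based pruning. At real scale you would
    # compute the full correlation matrix once (pyspark.ml.stat.Correlation)
    # instead of calling corr() per pair as done here for clarity.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("feature-pruning").getOrCreate()
    df = spark.read.parquet("/data/integrated_features")  # hypothetical path

    numeric_cols = ["f1", "f2", "f3", "f4"]  # in practice: thousands of columns
    to_drop = set()
    for i, a in enumerate(numeric_cols):
        if a in to_drop:
            continue
        for b in numeric_cols[i + 1:]:
            if b not in to_drop and abs(df.stat.corr(a, b)) > 0.95:
                to_drop.add(b)  # near-duplicate of a kept feature

    pruned = df.drop(*to_drop)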
For the modeling part, we rely heavily on Spark MLlib and external machine learning libraries, for example XGBoost; together they cover a wide range of modeling approaches and are very fast.
We have also explored different other learning algorithms, and as I mentioned, we are interested in bringing deep learning to the B2B market, but there is some thought you have to put in before bringing a new algorithm to B2B: we have a really limited dataset to use. The algorithms can deal with regression, classification, and clustering, and we have built a hybrid system that can do multi-level parallel learning to deal with different formulations of the problems.
With that, we get really fast iterations: we can see the results, have them validated by our sales reps, get the feedback, and do another round of modeling. The front-end API is very self-serve, so you can do a lot of different configurations and tune at different views and grains; it provides a lot of flexibility for our internal customers. Regarding the evaluation part of the model building, we basically use very standard machine learning evaluation approaches, but because these are B2B applications with specific business requirements, we also have in-house evaluation approaches aligned with the business needs. Next, I will talk about model interpretation and model management.
After the model building, we are often asked a very critical question, which is: why? Everybody says machine learning is a black box, but sometimes we are asked: okay, you have to open this box, I want to see what's inside. So we did a lot of research on this. We looked at what is commonly used in industry settings, for example calculating coefficients if you have a linear model, or extracting feature importances from some nonlinear model, trying to explain what the model is doing and why the model gives one data point a high score and another data point a low score. But we think that's not enough, and it still cannot really answer some of the questions from our sales reps.

So what we did is build hierarchical, multi-view component models. It's a fairly simple idea: you look at the different features, apply clustering algorithms, and cluster all the features into a hierarchical structure. From that structure you extract groups that have similar semantic meanings and group them into certain buckets. Within each bucket you build a model that represents the meaning of that dimension; for example, in this case we have components representing market, engagement, growth, and social. When a new data point, a company, comes in for scoring, we apply the master model, which estimates the overall score, for example a conversion probability, as the centralized score; the component models are also applied to this data sample, and they predict the strength on each of the dimensions. In this example we can see the account is actually very weak on the product usage dimension, so we can dive deeper to a finer level to see what caused the customer to not have a lot of engagement, and we see that the monthly active seats are low and the average MAU is low. Those are actually very good insights to give our sales reps to work with, so they have an action item: okay, maybe I have to do some training, maybe I have to talk to our customer to see what is going on and why they're not using our product. So this kind of interpretation helps us explain what is happening beyond the overall score, and makes sure reps can consume our output rather than just taking a number. (A minimal sketch of the component-model idea follows below.)
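In sketch form, the master-plus-component idea could look like this; the feature buckets here are hand-assigned for illustration, whereas the talk derives them by clustering, and all column names are assumptions:

    # Minimal sketch: one model per semantic feature bucket plus a master
    # model over all features.
    from pyspark.sql import SparkSession
    from pyspark.ml import Pipeline
    from pyspark.ml.feature import VectorAssembler
    from pyspark.ml.classification import LogisticRegression

    spark = SparkSession.builder.appName("component-models").getOrCreate()
    train = spark.read.parquet("/data/train")  # hypothetical path

    groups = {                                 # hypothetical semantic buckets
        "growth":     ["headcount_growth", "hiring_rate"],
        "engagement": ["monthly_active_seats", "avg_mau"],
        "master":     ["headcount_growth", "hiring_rate",
                       "monthly_active_seats", "avg_mau"],
    }

    models = {}
    for name, cols in groups.items():
        assembler = VectorAssembler(inputCols=cols, outputCol=f"{name}_features")
        lr = LogisticRegression(featuresCol=f"{name}_features", labelCol="label",
                                probabilityCol=f"{name}_prob",
                                rawPredictionCol=f"{name}_raw",
                                predictionCol=f"{name}_pred")
        models[name] = Pipeline(stages=[assembler, lr]).fit(train)

    # Scoring applies every model: the master column is the overall score,
    # and each component column is the strength on that dimension.
    scored = train
    for model in models.values():
        scored = model.transform(scored)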
Modeling and interpretation are just part of the entire development cycle; it's never the end, because your business is evolving and the company releases products frequently, so everything is changing. Both features and models have an inherent temporal nature, and we have to monitor them as they change over time. What we did is build a management and monitoring component that does feature profiling as well as performance monitoring. Once we see a performance degradation, we retrain the model: we collect new training data to generate a challenger model to compete with the older model, to make sure the model is up to date. We pay a lot of attention to analyzing failure and outlier examples, to see what is going on and whether any data can explain those failures or outliers. We also compute feature profiles using basic statistics, so we can see whether there is a trend in a feature, or whether something happened because upstream data has changed. If there is any significant change, we recall the previously built model and think about strategies for rebuilding it, and of course we do some feature diagnosis.
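The talk only says "basic statistics"; the population stability index (PSI) is one common concrete choice for such drift checks, sketched here on a single binned feature with made-up numbers:

    # Minimal PSI sketch: compare the binned feature distribution at training
    # time against the current month and alert when it drifts.
    import math

    def psi(expected_fracs, actual_fracs, eps=1e-6):
        """PSI between two binned distributions given as bin fractions."""
        total = 0.0
        for e, a in zip(expected_fracs, actual_fracs):
            e, a = max(e, eps), max(a, eps)   # guard against empty bins
            total += (a - e) * math.log(a / e)
        return total

    baseline = [0.25, 0.35, 0.30, 0.10]   # bin fractions at training time
    current  = [0.10, 0.30, 0.35, 0.25]   # bin fractions this month

    if psi(baseline, current) > 0.2:      # a commonly used alert threshold
        print("significant drift: train a challenger model")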
Next, I will take you through some of the case studies we did using this engine. The B2B solution is actually a big platform, and these two problems are quite representative. The first one is the account propensity score, which answers the question: which enterprise accounts have a higher chance of buying a product? It focuses on new business customers. The second problem is upsell propensity, which tries to answer: which existing enterprise customers have upsell potential? It focuses on existing customers.

Both of these problems have some things in common. For example, things vary a lot across different regions, like EMEA, America, or Asia, and differ across products (at LinkedIn we have the sales solution and other products); also, the data evolves dynamically, which is the same for almost all the problems we face. Both can be formalized as a binary propensity model, but the two have some differences. For the account propensity model, we have a lot of data to score, but the data is very sparse, because we don't know much about the company entity, and it's very noisy. Some regions have very, very small training datasets; we may not have much in the Asia market or some other markets, so for those regions we have to leverage information from the other regions. And for score accuracy evaluation, the whole spectrum matters, meaning the lowest and the highest scores are equally important. In contrast, for upsell, accuracy is more important for the top ones; like in a recommendation system, you focus more on the top recommended items. Also, for upsell, because they are existing customers, the data is much more trustworthy and has a lot of rich information, especially product usage features. So the important signals are different.
Now let me talk about label generation. The feature mart was built with thousands of features, and the label is the other counterpart that is equally important. For account propensity, the label is defined at the account and region level: we treat the closed-won opportunities as positives and the closed-lost ones as negatives. But there are a few tricks here. The first is that the label is defined at the time of the closed-won, when the opportunity is finished, but the features associated with it have to be looked up as of the opportunity creation time, not the end time, because you're always predicting the future. Next, you have to find explicit negatives. What is an explicit negative? When you're building the model, you usually build it at the end of the year for next year's prioritization, looking back at the whole year's data. There are cases where a rep reached out to an opportunity once and it was a failure, and cases where the rep reached out multiple times and every time it was a failure; those are the good-quality negatives, which we call explicit negatives. What we have to exclude, or pay attention to, are the ones we call flipped opportunities. Looking back over the whole year's data, you may see that a rep reached out to a customer multiple times, and only the last time was a closed-won, with everything before that a failure. In this case we still consider it a positive, but we exclude any earlier failures from the training dataset, as well as the features associated with those data points. (A minimal sketch of this rule follows below.)
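A minimal sketch of the flipped-opportunity rule; the schema (account_id, created_at, outcome) and the outcome values are assumptions for illustration:

    # If an account's latest opportunity in the window is closed-won, keep it
    # as a positive and drop the earlier failures; accounts with only failures
    # remain explicit negatives.
    from pyspark.sql import SparkSession, functions as F, Window

    spark = SparkSession.builder.appName("label-gen").getOrCreate()
    opps = spark.read.parquet("/data/opportunities")  # hypothetical path

    w = Window.partitionBy("account_id").orderBy(F.col("created_at").desc())
    latest = (opps.withColumn("rn", F.row_number().over(w))
                  .where(F.col("rn") == 1)
                  .select("account_id", F.col("outcome").alias("last_outcome")))

    labeled = (opps.join(latest, "account_id")
                   # drop failures that precede an eventual win
                   .where(~((F.col("last_outcome") == "closed_won") &
                            (F.col("outcome") != "closed_won")))
                   .withColumn("label",
                               (F.col("outcome") == "closed_won").cast("int")))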
Label generation for the upsell model is a bit similar but also different. For upsell, labels are defined at the account and product level, i.e., per LinkedIn product, and we treat two scenarios differently and then merge them: one scenario is called add-on, and the other is renewal. For renewal it's a little simpler: when you have a contract, you look at what happens at renewal; if it renewed with more money coming in, it's a positive, and if it renewed with equal or less money, it's a negative. For add-on it's a little more complicated: since we focus on accuracy for the top recommended ones, we really need good-quality positives. When you think about add-on cases, you also have to consider the churn and renewal status. For renewed accounts, we look back to see whether there was an increased or a decreased sale: an increase we consider positive, a decrease we consider negative. But any contract that actually churned after the initial contract, no matter whether there was an increased or decreased sale, we treat as a negative in either case. (A minimal sketch of these rules follows below.) That's all for the label generation part.
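A minimal sketch of the upsell rules just described; the schema (prior_amount, renewal_amount, and a boolean churned column) is an assumption:

    # Renewal rule: more money at renewal = positive, equal or less = negative.
    # Add-on rule: a contract that churned after the initial term is negative
    # no matter whether spend increased or decreased in between.
    from pyspark.sql import SparkSession, functions as F

    spark = SparkSession.builder.appName("upsell-labels").getOrCreate()
    renewals = spark.read.parquet("/data/renewals")  # hypothetical path

    labeled = renewals.withColumn(
        "label", (F.col("renewal_amount") > F.col("prior_amount")).cast("int"))

    labeled = labeled.withColumn(
        "label", F.when(F.col("churned"), F.lit(0)).otherwise(F.col("label")))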
I'm going to skip the modeling details, because the modeling basically goes through the flow I described, with a lot of iterations going on; that's too much for this presentation. Instead, I'll show you some of the results as well as the interpretation results. There are two tables here, one for the account propensity score and one for the upsell. The first line of each table is the master score, and the rows underneath are the dimensions, or what we call the component scores, that try to explain what is going on behind that master score. For example, for the account propensity score, we identified an account as fast-growing, which is a good indicator reflecting potential demand. When we tell this to our reps, it basically says: okay, you need to find out how our products are going to help this customer, and maybe you can help them to further grow and increase their social selling or affinity. On the other side, for upsell, we identified an account as less engaged; that means we may have to provide better onboarding or training to the customer, to make sure they use our products a little more, and prevent future churn.
Regarding business impact, we have applied these two models in both marketing and prioritization. To give just some Q1 results: for account propensity we saw about a 5.3-point boost compared to no prioritization, and for product upsell about a 3- to 3.3-point higher click-through rate compared to the baseline model, and we generated 20 to 40% more opportunities with customers. So overall, what we are trying to do here is turn the chaos of data and modeling into a straightforward, end-to-end, powerful B2B solution. We built a centralized, large-scale dataset together with an intelligence engine on top of it; it helped us speed up the entire modeling cycle, so we have fast iterations, and it has shown significant business impact. That's about it, thank you very much. If you have questions, feel free to ask, and we are hiring, so if you're interested, come talk to us. Thanks a lot.

Moderator: Thank you. We're open to questions. Anyone want to start?
Q: You may have covered this, but when you talk about the component and the master: is the master score a sum of the components, or is it just built on all the attributes? Is there a correlation between those two concepts, or are they separate runs in parallel?

A: That's a really good question; we had the same thoughts when we built this. The master model and the component models are built independently, but afterwards we do have a calibration stage that tries to match the master model together with the component models. I didn't talk about the calibration because it's a little more complicated, and it also sometimes differs across business use cases. For example, some businesses require the simplest possible summarization, while others require more accuracy, in which case we may actually have additional modeling on top of the outputs of those component models. They are trained independently, using exactly the same label set but different feature sets, so you can consider a component model an estimation of the master score as well.
Yes, that's correct. Usually, if you look at the distributions of the data points, even if you just add the component scores together, the distributions mostly overlay with each other for the majority of the data. You do have a long tail for some of the accounts, and that's inevitable, because you have those long-tail estimates for some accounts even in the master score; there's no way a subset of the features can capture everything in the whole feature set.
Q: The profiles on LinkedIn are largely individual professionals, right? And here the scores are actually at the account level, sometimes the team level. How do you aggregate the behavior data from individuals to accounts?

A: If I understand your question correctly, you're asking how we aggregate individual member-level features up to the account level, since the account level is at the company level. It works through the connections on the LinkedIn economic graph I talked about: you have a connection with your company, for example you may identify yourself as an employee of a certain company. Within that company on LinkedIn we have all the affiliated employees, and each day they have different activities, and those activities are aggregated. The simplest aggregations are basically averaging, but in other cases we focus on job functions: for example, developers have their developer activities, and that may reflect whether a company is more of an internet or IT kind of company, or some other type of company. (A minimal sketch of this roll-up follows below.)
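In sketch form, that member-to-account roll-up, overall and by job function, could look like this; the column names are assumptions:

    # Minimal sketch of rolling member activity up to the account level.
    from pyspark.sql import SparkSession, functions as F

    spark = SparkSession.builder.appName("member-agg").getOrCreate()
    activity = spark.read.parquet("/data/member_activity")  # hypothetical:
    # member_id, company_id, job_function, searches_last_month, ...

    # simple averaging per account
    by_account = activity.groupBy("company_id").agg(
        F.avg("searches_last_month").alias("avg_searches"),
        F.countDistinct("member_id").alias("active_members"))

    # job-function-focused aggregation per account
    by_function = (activity.groupBy("company_id")
                           .pivot("job_function", ["engineering", "recruiting"])
                           .agg(F.avg("searches_last_month")))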
Moderator: Any more questions?

Q: First of all, very good presentation, thank you, a lot of insight. I have a question on your component models and your overall models. On your last slide you show the component models as high, low, or medium for each component, but the overall score is a number. For example, if one overall score is 92, with most of the components high, and another one at 570 has a very similar component profile, how would you explain those to your business user?

A: That slide is an illustrative example, not actual data. For actual data we do have those problems, like the first question asked, so we have a calibration stage after this which looks at each of the components. See, if you're just looking at two estimates, no matter whether component or master, from two different machine learning models, how are you going to align them? There is actually no direct way. So we align them within each model: for example, if we break accounts into tiers for the master model, we also have that tiering under each of the component models, and those tierings help us align the component scores with the master. As for the generated text: the actual features are just numerical values, but the text we generate is for the benefit of the sales reps, so they can roughly understand what level or status that account is at. (A minimal sketch of the tiering idea follows below.)
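One way to sketch that tiering: bucket each model's raw scores into quantile tiers so master and component outputs read on the same footing. QuantileDiscretizer stands in here for the in-house calibration, and the score columns are assumptions:

    # Minimal sketch of quantile-based tiering of model scores.
    from pyspark.sql import SparkSession
    from pyspark.ml.feature import QuantileDiscretizer

    spark = SparkSession.builder.appName("tiering").getOrCreate()
    scores = spark.read.parquet("/data/scores")  # hypothetical columns below

    for col in ["master_score", "growth_score", "engagement_score"]:
        qd = QuantileDiscretizer(numBuckets=4, inputCol=col,
                                 outputCol=f"{col}_tier")
        scores = qd.fit(scores).transform(scores)
    # "tier 4 on master, tier 1 on engagement" is now directly comparable.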
Q: Thanks. You have a number of activities that you can associate with companies, but the goal is really to associate the activities with an opportunity. What happens when you have multiple opportunities with the same company? How do you know which activity goes with which opportunity within the same company?

A: There are two types of activity or behavior features we use to empower those models. One is general LinkedIn member activity, like how they search, how many searches happened in the last month, how many job searches, people searches, and product searches; those can easily be aggregated to the account level and empower the individual opportunities, since each opportunity maps to an account in a many-to-one mapping. At the same time, on the opportunity side, the sales people engage with individual opportunities, and we have another set of opportunity-level behavior features available in our feature mart. So we leverage both types of behavior features to do the modeling.
Q: I liked the earlier question on opportunities. Say a company is a global company, and you have activity happening in France and activity happening in the US, and there are two open opportunities. How do you associate that activity with the two open opportunities if they're within the same account, if that makes sense? I've struggled with this at four different B2B companies. I don't know if you're using Salesforce data, but often you get a lot of the activity data associated with the account, not with the specific open deal; maybe that's a data structure you don't deal with, I'm not sure if that's what you were referring to up in the front there. Does the question make sense, or can I restate it?

A: Yeah. We do have cases where one international company has presence in different regions, and we decide to consider the entity as just one global account. We can also identify their different branches around the world, identify the employees associated with those branches, and aggregate the behavior features to the individual branches, so we have several layers of aggregation happening, which can generate a bunch of features to be used by the model.

Q: Okay. Maybe another way to triangulate the question: do you often see multiple open opportunities under the same account, for example within one of those branches, or is there usually one open opportunity per branch?

A: Definitely, there are multiple opportunities there, and it really depends on the model, on what kind of entity you're talking about. For example, for the account propensity model, we are specifically talking about the account-level entity, so one account may be associated with multiple opportunities, but all the activities and labels are aggregated to the account level. On the other side, when we're talking about an opportunity-level model, like the churn model, what we try to represent is very specific to one opportunity, so we use the feature representation for that entity.

Q: Okay, thank you. Sorry, let me take the question all the way down to the training: when you feed in the training dataset, your label is a closed opportunity. We're building a similar model right now. I'm guessing you find the account under which that opportunity was closed-won, and then take all of the preceding activities within that account; that's maybe one way to make sure you're passing all of the right features for that opportunity. Does that make sense?

A: That's maybe a whiteboarding session, so we'll take it offline; I'll see if I can connect afterwards and clarify. Thanks.

Moderator: Alright, thank you everyone, that was a great session. Let's give a round of applause to thank our speakers.

Description

"B2B sales intelligence has become an integral part of LinkedIn's business to help companies optimize resource allocation and design effective sales and marketing strategies. This new trend of data-driven approaches has “sparked” a new wave of AI and ML needs in companies large and small. Given the tremendous complexity that arises from the multitude of business needs across different verticals and product lines, Apache Spark, with its rich machine learning libraries, scalable data processing engine and developer-friendly APIs, has been proven to be a great fit for delivering such intelligence at scale.

See how Linkedin is utilizing Spark for building sales intelligence products. This session will introduce a comprehensive B2B intelligence system built on top of various open source stacks. The system puts advanced data science to work in a dynamic and complex scenario, in an easily controllable and interpretable way. Balancing flexibility and complexity, the system can deal with various problems in a unified manner and yield actionable insights to empower successful business. You will also learn about some impactful Spark-ML powered applications such as prospect prediction and prioritization, churn prediction, model interpretation, as well as challenges and lessons learned at LinkedIn while building such platform.

Session hashtag: "

Keywords

apache spark, spark summit

Tags

#SFent8
