Archive for the ‘QueryFormation’ Category

I’m deeply in love with Western Massachusetts.

I can tell it’s love because, given the choice between burbs and cities, I can’t abide either. This no-choice-at-all is called the Pioneer Valley (and that’s the path I’ve chosen). The connection runs deeper than the infatuation of floods or famines. It’s given me more than any lover, credential, or birthright can bear. That’s the simple embrace of community.

It’s a zagging of offshoots. It’s a smattering of vocal opinions beholden to even louder lives in motion. The Valley residents that inspire this love don’t just think something would be a good idea when they get around to it but make the time. I hear the words, see the actions, sense the community, and feel a group participation number coming on. And it sounds like this:

  1. We can’t make the money here but we can make the time — and the effort to engage the neighbors, encircle the orbits, and divide up the work.
  2. That’s the thinking behind helping Valley folks become more “Knowledge-ABLE.”
  3. That’s the term defining the web-based research skills and actionable outcomes convened by the Society for Useful Information.
  4. We meet each Thursday evening in the Sawmill River Arts Gallery at the Montague Bookmill.
  5. Click here to sign up for an upcoming session.

Who are the de-facto members of this Commonweal? They’re the local educators, students, small business owners, web designers, social workers, and policy advocates. They don’t need more connect time but better connections to their own research, customers, funding sources, and the powers to size up the stature, positions, and end games of the folks they’ll be seated across from in their next business trips, job interviews, and power negotiations.

One of the other benefits of my Valley allegiances is that it only takes unmasking those affections in public for the media to cover these community-based meet-ups. Last week I landed on the front page of the Greenfield Recorder and Daily Hampshire Gazette’s reprint of the same piece. And I didn’t have to hack into any phones or call in any favors. The reporter Chris Curtis did a stellar job of recounting the hit-and-miss trial-by-errors of “the founding and sole member” of the Society for Useful Information. What could be more ground-breaking and less pressing than teaching to an impartial observer and “publicity-shy Amherst area consultant?”

As lifelong friend and chief rhetoric connoisseur Terry Canade remarked later on email:

“Your writer/editor was good at distilling quotes which sound like you, move the story along, and intrigue the reader.”

The proof in Canade’s assessment lies in last week’s attendance. I drew double digits — practically standing room only (and none of them repeat members). None of them were entirely clear on what to expect but they all showed up assured that the full cost was absorbed by the gas eaten up to occupy “a place that’s impossible to find.” Two hours later they had a grasp of the virtues of…

  • Social bookmarking (exhibit: Delicious tagging and commercial-free search)
  • Getting from a search ocean to a proprietary pond (exhibit: child molesters and malpractice doctors)
  • Visualization tools (exhibit: connecting the vectors between events and stakeholders through Silobreaker and Muckety)
  • Word algebra (exhibit: query formation techniques that combine a handle on semantics, syntax, and search operators)
  • Credibility factors (exhibit: Site Explorer to compare self-directed and externally triggered attention)
  • Timeliness (exhibit: the /date slashtag in Blekko search)

I was exhausted by the end of the session. I was compelled to do most of the talking because I didn’t have prior knowledge of attendees’ search projects or the class size. I learn lots more when we can move beyond lecture and offer the group therapy benefits that come with articulating our silent confessionals to Google. Bringing voice to the discovery process is far more rewarding than the best engineered web destination. To paraphrase Karl Weick (or perhaps E.M. Forster):

“How do I know what I click on until it clicks with those I’m searching (or reaching out to)?”

We’ll know when curiosity leans forward and attendees find themselves returning with theories to test. That’s when the class will find its voice. That’s when the classroom will evolve into a round table and participatory search will become the key to self-education online. For now, crowding around the communal browser is a throwback to the early days of TV. Soon perhaps, it will be a leap forward to a collective experience that is anything but placid or unquestioning.

Just like the community of my affection.

How do we know it’s time to suck in the bulge around our brains?

The marketplace for ideas need not be a storefront

That swelling is not transmitted by airborne virus or insect. Most likely it’s the encrypted WIFI signal ready to auction off our attention based on:

    • Who wants it,
    • Where they will be for the next 45 minutes, and
    • a coupon for perishable inventories echoed in the placeholders of past browser sessions.

Like any internet marketer, Google focuses its efforts on converting web surfers into online shoppers, translating our calls for help into online ads it sells to the market. This arrangement connects some sellers to buyers but does little to resolve our need for quality information from credible sources. We must fend for ourselves when it comes to online investigations. And that isolation compromises our effectiveness as learners:

  • The web by remote control: Google IS the internet. If it doesn’t rank, it doesn’t exist.
  • The priority switcheroo: What landed last is what deserves to be acted on first.
  • The sourcing dilemma: The motives behind information providers and how that squares with evidence used to substantiate their claims.

The simple truth is that Google has a great deal of value for the researcher — little of it that’s monetizable. Ironically Google’s unrecovered assets are where web researchers should be investing big time:

  1. How to assess (frameworks and models): Aligning knowledge-seeking requirements with an informed sense of where to go and how to act on what we expect to find there
  2. How to ask (semantics, syntax, and operators): The building blocks of query formation; the act of interrogating databases and training them to do our bidding
  3. How to act (context and meaning): Applying source fluency in order to scope the credibility, authenticity, and ultimately our understanding of where our findings are leading us

Internet sources and the search results they spawn are dynamic, conflicting, and open-ended. Our time commitments, however, are not. We have set objectives and hard stops for reaching them. Too often our need for closure and certainty suppresses our hunger for learning. Connecting poor search results to preferable outcomes is like trying to shape up on a diet of lard and donuts. No matter how many searches we run and site results we sift, we retreat from the computer with more uncertainties and less time to settle them.

There are two reliable reasons most of us go online to research:

  • Learn enough to act on what we’ll learn
  • Reduce the uncertainty around what those actions may bring

These are the themes and objectives we’ll be exploring each Thursday evening this summer at the Montague Bookmill. We hope you can join us for our weekly meetups from 7-9 pm. The series begins next Thursday, June 2nd. We look forward to your input and addressing the pending research projects that fire your desire to learn in the first place. A laptop is optional but do pack your own homegrown research problems.

And be prepared to share.

"When you use more than 5% of your brain, you don't want to be on Earth."

I’ve probably teared up more at an unfair hockey fight and I’ve had more emotionally-engulfing movie goings. But as far as a life philosophy that plays out on screen, no self-contained cinematic mythology holds my candles quite like Albert Brooks’ Defending Your Life.

For most of the story Brooks’ day of judgement is about to play out in the purgatorial trappings of a Disneyesque lodging and office complex. Is the protagonist to advance on the eternal enlightenment path to some higher plane? Will he shuffle back on the next tram for a return date with “the little brains” on earth? That swipe at us live inhabitants is a line delivered by Rip Torn, Brooks’ defense attorney who testifies to a 53% utilization of his own cranial capacity. We little brains use 2-3%; the remainder of our mind-shafts is crowded out by lethargy and fear.

When One Framework is Worth a Thousand Taxonomies

I’ve wondered what fears could be confronted and ultimately shed so that I could soar perhaps from 2-3% up to 5-6? In that spirit I’ve recently stumbled across a framework called Bloom’s Taxonomy. Like the defense lawyer’s slam at our low-performing mental capacities and fear-mongering, Bloom said that 80-90% of our highest brain function operates in the lowest realm of sense-making. He calls this “knowledge.” Knowledge is accessed through the following retention portals:

remembering, memorizing, recognizing, recalling identification, recalling information, who, what, where, when, how, describing

Kinda oafish, no? It’s deciphering 101. It’s on or it’s off. X=Y or fuggedaboutit.

The pattern-matching of keywords is not the face that launched a thousand ships but the probability gag that seated a thousand monkeys at their typewriters in order to write the great American novel or the great American Internet start-up, whichever cashes out higher. Just Ask Jeeves! These are the well-trodden grounds of that cloistered chamber you and I have come to know as web search. Its premise is still tuned to exact-match good-enough-ness. That’s because we can be sold nouns even more easily than the notion that our mental blanks are being filled in by omnipotent language engineers. We frugal consumers cave to deals on things, not to actions about ideas. Nouns are the merchandise, not the verbs that help us to backorder our understanding of what we actually do with our bill of goods. Unless we’re potential suspects in a case, no one is interested in our trail o’ stuff, unless they can sell it to us again.

The next order of mental processing is to isolate noun phrases from their predicates. That means getting the search engine to distinguish actors from their actions, reducing outcomes to a range of questions we’re ready to answer — or at least lower our surprise should they arise. That kind of conditional logic exists in our mental reflexes whether we’ve had our morning shower or coffee.

It’s interesting that in the pecking order of brain function the inverted pyramid of journalism ranks somewhere in the custodial closet of the ivy-coated shrines of higher learning. Not incidentally these are the unremarkable terms on which IBM’s Watson, the question answering machine, beat its human Jeopardy contestants to the buzzer. It took a fact base so bottomless it would turn baseless in the gear shafts of the most fervently applied quiz show savant. Watson’s algorithmic swagger chewed through mounds of trivia like a smoldering ash heap of documentation fertilizer.

Elementary School My Dear Watson

The conquest prompted one of the IBM partisans to reflect in the New York Times on finding Watson more meaningful work:

“I have been in medical education for 40 years and we’re still a very memory-based curriculum,” said Dr. Herbert Chase, a professor of clinical medicine at Columbia University… “The power of Watson-like tools will cause us to reconsider what it is we want students to do.”

At the same time Watson’s next gig as a physician’s assistant begs a more immediate question: how do we humans raise our learning games to Bloom’s next levels of comprehension, application, analysis and synthesis? How do we aid and abet the healthy transfer of knowledge between us inquiring pea brains?

Knowing a lot about an academic discipline is, at best, tangential to teaching it. Having a natural understanding of a subject can be an unnatural fit for passing that understanding along to others. Assuming that academics are better at publishing papers and attending conferences than at educating students, the question falls to the insatiable learners among us: how do we teach ourselves on a level beyond the aspirations of Watson’s parents? How do we convince supple, young minds that a healthy dose of skepticism about humans is only the first of a storehouse of rational and instinctive reasons to doubt the merits and intentions of question answering machines?

The current cover story of the Atlantic Monthly offers up Mind Versus Machine. Here science writer Brian Christian serves in the oppositional role of the two Jeopardy adversaries to Watson. The objective of the annual Turing Test is for AI (“artificial intelligence”) programmers to convince a sequestered panel via screen text that a machine could out-human its creator in a range of topics spanning from “celebrity gossip” to “heavy-duty philosophy.” The advice Christian was given when cramming for this contest?

“Be yourself.”

Gee, and I thought I knew how to body surf with the more cryptic sharks!

Five minutes of IM messages later Christian was crowned the winner of the Most Human Human Award — chiefly for two reasons:

  1. His dominating volleys (he’s not waiting on Alex Trebek to pounce, pry, or provoke)
  2. His insights into how the bottom feeder knowledge spoon-fed to his AI adversary highlights natural human intelligence in the experiential realm:

One of my best friends was a barista in high school. Over the course of a day, she would make countless subtle adjustments to the espresso being made, to account for everything from the freshness of the beans to the temperature of the machine to the barometric pressure’s effect on the steam volume, meanwhile manipulating the machine with an octopus’s dexterity and bantering with all manner of customers on whatever topics came up. Then she went to college and landed her first “real” job: rigidly procedural data entry. She thought longingly back to her barista days—when her job actually made demands of her intelligence.

That’s a lesson well worth reteaching ourselves the next time we find ourselves needing to justify more question/answer sessions scheduled in the upper echelons of Bloom’s taxonomy.

Photo by Jim Henderson | jim@henderson.org.nz

“Never mistake your presence for the event.”

- Roscoe Lee Browne

I’d like to say that the highlight of Tuesday’s Open Mic Night was that I got to teach what I love in the manner I love doing it (teach). I’d like to affirm further that there was an airy effortlessness to the presentation. After all this was a captive, active group — engaged, smart, skeptical — all the requisite aptitudes. Finally I should clarify that the staging was in a stately conference facility with a robust wifi signal, no dial-in audience to accommodate, and most importantly … no institutional middle man.

Truth is, the most gratification came from assembling two Boston-based information communities — my PI/detectives and fellow SIKM colleagues — then watching the collaboration fly across the conference table. The joy of discovery is one thing. But sharing that joy is pure rapture.

Kirstie Fiora filled in for my woeful event-planning deficits — ushering in attendees past locked front doors and assorted roadblocks that escaped my logistical skills for bringing people together. I’ve been living off her Angie’s Kettle Corn snack offering since the session ended. Pathetic.

In addition to Kirstie and brother Gordon, my improbable roundtable for round one included:

  • Ann O’Connor (PI) — Researcher, International Brotherhood of Electrical Workers (“IBEW”)
  • Carrie LaRose (PI) — Comptroller, H&H Delivery
  • Dave Wallace (KM) — Managing Partner, GameChange LLC
  • Joe Cadillic (PI) — Private Investigator, Murphy & Associates
  • John Dalli (PI) — Owner, Worcester Record Search
  • Kate Pugh (KM) — President, Align Consulting
  • Paula Cohen (KM) — Knowledge Manager, Information Enterprises

The premise of the News Radar theme was that we’re fussing over stuff that should not rise to our radar levels. I demonstrated tools for flushing information waste products back down below the sewer line where they belong. On the attention scale we rake our mental bearings into three piles:

  1. PURPOSE: What makes our lives worth living
  2. OBJECTIVE: What do I do to make #1 happen
  3. DISTRACTION: What gets in the way of #2

Way too much of the web is sheltered under the growth of pile #3. That’s where I introduced the pruning shears of semantics, syntax, and search operators for cutting away the rubbish. I also tried to give them some bearings on when a scarcity of information actually calls for expanding the boundaries through keyword cultivation or the simpler queries that create more productive outcomes in ponds (specialty databases) in lieu of oceans (commercial search engines).

These examples were modeled on the saving graces of an XML-centric approach for having useful information find us, assuming that web searches don’t fall in the #1 pile camp. We used syntax to forage for RSS feeds. We used semantics to develop some word algebraics for trapping some common corporate event triggers (marketing, finance, regulatory, musical management chairs, etc.). We even set up watch lists to ensnare specific people we might call on, whom we can identify by the job titles in the announcements of their promotions.
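
For anyone who wants to see the word algebra spelled out, here is a minimal sketch of how a trigger vocabulary and a watch list might be combined into one query string. The function names, the verbs, and the job titles are all made up for illustration, and the sketch assumes an engine that accepts quoted phrases, parentheses, and an uppercase OR, with a space acting as AND. Operator support varies from one search tool to the next.

```python
def or_group(terms):
    """Join synonyms into one parenthesized OR group, quoting multi-word phrases."""
    quoted = [f'"{t}"' if " " in t else t for t in terms]
    return "(" + " OR ".join(quoted) + ")"


def build_query(*groups):
    """AND several OR groups together (assumes a space means AND)."""
    return " ".join(or_group(g) for g in groups)


# Hypothetical trigger vocabulary for a "musical management chairs" event.
promotion_verbs = ["appointed", "promoted", "named", "joins as"]
# Hypothetical watch list of job titles worth calling on.
job_titles = ["chief information officer", "CIO", "knowledge manager"]

query = build_query(promotion_verbs, job_titles)
print(query)
```

The same two helpers can be reused for the other triggers (marketing, finance, regulatory) by swapping in a different verb list.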

We tested the relationship of push and pull sourcing by using Site Explorer to compare self-referring links to external ones as an even-handed basis to gauge the credibility of information providers by their web domains and pages. Finally we went shopping for higher level concepts like citizenship in Google Keywords and came away with the humbling conclusion that “Citizen Watches” were likelier to tell us Google time better than the Bill of Rights. Another discrepancy worth noting is the number of searches generated for terms versus the ad dollars they fetch, e.g. “SAS Document Management” yielded 73 searches last month despite its “competitive” appeal to Google advertisers.
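
The push-versus-pull comparison boils down to simple arithmetic. The sketch below is one way to express it, assuming you have already copied down two link counts per domain by hand. The function name, the domains, and the numbers are hypothetical, and nothing here calls Site Explorer or any other service.

```python
def external_link_share(self_links, external_links):
    """Share of inbound links that come from other domains (0.0 to 1.0)."""
    total = self_links + external_links
    if total == 0:
        return 0.0  # no links at all, so nothing to measure
    return external_links / total


# Hypothetical link counts, not real measurements.
sites = {
    "self-promoter.example": (900, 100),
    "well-cited.example": (200, 800),
}
for domain, (own, outside) in sites.items():
    print(domain, round(external_link_share(own, outside), 2))
```

A higher share suggests the attention is externally triggered rather than self-directed. It is a rough gauge, not a verdict.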

The one underwhelming demo, I thought, was the section on rolling your own custom search engines. We grouped media sources back into their traditional pre-web categories. Remember the term “paid media” to describe the success of 20th century PR campaigns? Didn’t think so. The disappointment was in the lack of evidence that the keyword refinements did much to skew the results or tell the underlying story of the custom search theme. I also flew over the information pond completely. That means I didn’t focus much on using Google as the search engine of record in order to qualify and build contact lists from social media sites.

I’m looking forward to the next several sessions in Western Mass and will try to localize them. Candidate projects on the agenda? Building alumni lists from LinkedIn of Five College graduates.


The choice of text versus numbers is starting to ring false. The trade-off between relational tables and keywords is no longer a stretch or a compromise. The missing ingredient isn’t the optimal content database or the more responsive search tool but the outcomes that live in the cross-hairs between traditional BI and conventional keyword matches, and what began many formatting standards ago as decision support.

The purpose of SearchBoards is to classify content on a granular level. The goal is not panning for knowledge gold but to scratch the itch that prompts the question. Searchboarding doesn’t retrieve articles and files; Search Targeting informs what happens next. As Judith Jaffe, Knowledge Manager from the Risk Management Foundation, put it in yesterday’s Boston KM Forum, it’s to embed interventions into workflows. It’s us knowledge workers reconfiguring the juggernaut of documentable consequences. In English that means indexing spreadsheets so that the nuggets are discoverable, process-specific, action-based, and quantifiable as assets.

The counting goes beyond raw first and secondary wordcounts inherent in typical SEO analytics and goes to a tender info fantasy older than any taxonomic model. That’s flipping on a switch and having the proposal auto-generate or the diagnosis nestle in a warm bed of evidence. There’s a problem, a set of case tables, and a battery of check boxes. No one is left holding the word bag.

This is a good thing because it takes the conversation away from hit counts and page ranks and into the more tangible matters of solving problems and completing tasks. It’s not about capturing insights — yawn. It’s about the rich conversation between what we’re working with (data sources) and what we’re working on and against (projects and deadlines).

Another promising development is that when our data sources are bullets and talking points, we remove the ambiguities that are full-time occupants of Planet Google. And those doubtful citizens answer to a toppled leader called “intention.” And the lingua franca of intentionality is particles of speech. They disappear with SearchBoards. That’s because SearchBoards eliminates the source of the ambiguity: that troublesome middle man between all causes and effects called the predicate. It’s problematic because predicates are the nerve endings of human logic and they fall apart completely at the mercy of search technology.

And those search engines are as good at teaching us how futile this is as they are abysmal at overcoming their own limitations. We’ve been trained well to keep our expectations low. Witness a Stanford University study cited by yesterday’s forum speaker Mark Sprague that suggests 2.4% of all search terms include verbs. No small wonder we have no idea what to do with our global information surplus.

Another tedious argument that goes away here is the Coke vs. Pepsi piss-off that parallels taxonomies and folksonomies. The liberation here is that common meeting grounds like “results” or “teams” or “industries” lend themselves to pattern-friendly sets of finite values (classification schemes). Other more fluid fields like “results” or “objectives” remain open-ended. But the rich variety of how those stories play out becomes the bucketed narratives on the SearchBoard results queue.

Finally the biggest payback is that we get to keep serendipitous top-of-mind association. Was there ever any doubt? And we can still bask in our most enduring content structures. What’s there not to like when the only thing we have to Google is Google itself?

In 22 years of being online I can tell you the number one time waste is guessing that the person you’re searching on is really the individual who keeps coming up in your search results.

Even if you know where they went to high school or their middle initial, the incomplete details of a partially formed profile can open up more doors than it closes.

I created a confidence ratio for my PI students to gauge the likelihood that my pre-class Googling of them was really them. I told them that I put together this puzzle to show them:

  1. Some examples of semantics and operators, two of the components that comprise the work we do in query formation
  2. A fact-based way to gauge the likelihood that they’d nailed the right guy

But beyond the math and science there’s also a lot of frustration spent on the nailing: the obsessing over whether we have the right guy or not. In an investigation where our fact base is limited to the actions of one suspect, there is little choice.
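
For the curious, here is one way a confidence ratio could be computed. This is a sketch under my own assumptions, not the worksheet from class: it simply divides the number of known facts a candidate profile confirms by the number of facts we started with. The field names and the sample profile are invented.

```python
def confidence_ratio(known, candidate):
    """Fraction of known facts that the candidate profile confirms.

    A fact missing from the candidate counts as unconfirmed,
    so thin profiles score low.
    """
    if not known:
        return 0.0
    matches = sum(
        1
        for field, value in known.items()
        if str(candidate.get(field, "")).lower() == str(value).lower()
    )
    return matches / len(known)


# Hypothetical facts we already hold about the person we are searching on.
known = {
    "middle_initial": "J",
    "high_school": "Central High",
    "city": "Springfield",
    "employer": "Acme Corp",
}
# Hypothetical profile that keeps coming up in the search results.
candidate = {
    "middle_initial": "j",
    "high_school": "Central High",
    "city": "Shelbyville",
}

print(confidence_ratio(known, candidate))  # two of four facts confirmed
```

A ratio like this won’t settle who the right guy is. It only makes the guessing visible.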

I’ve seen, all too often though, that investigators tie themselves in knots because they don’t allow for a range of outcomes that includes several possibilities — be it witnesses, experts, interpretations, or even competing explanations for why the crime occurred.

I tell them that as we get deeper into the Internet realm of criminal research a range of productive outcomes is a lot more realistic (and healthier on your heart) than fixating on one suspect and nailing them to whatever … they deserve.

Oh yeah, there’s one other reason I dragged them through this. Second biggest time waste on the web? It’s not Britney Spears or Michael Jackson. That’s right: it’s vanity searches. And until we either make introductions or Google one another, that’s all we have to go on.

You’ve heard it before — especially in a public setting seeded with unfamiliar faces: “There are no stupid questions.”

Mostly the moderator who says this is responding to a lack of feedback — especially when the presentation they gave is either alien or controversial to at least some of the participants.

In all honesty the stupidity lies with the moderator for boxing themselves into an exchange-proof presentation. But if we were even more honest about the kinds of questions that drive search analysts and KM folks batty it’s a misinformed question built on the premise of unfounded assertions, urban legends, and generalized assumptions that stretch the appropriateness of their fit too far.

For example it’s entirely understandable that some rocket scientist raised on Google believes they could pepper their query with the names of propellants and launchers and then truncate on a few choice biological weapons. What’s misinformed about that? Nothing if you’re on the web. However if it’s done on your firm’s SharePoint server and rockets are not what you sell and maintain then you run into two walls right away:

1. Complex question +
2. Uncommon terms =
3. Dumb question

Of course the site admin who sees it is no likelier to point this out than the search tool itself. Can you imagine buying the Google appliance and for every “zero hit” set of search results the response is “Did you mean to search this on public Google?” The problem metaphorically is that Rocket Star is sticking to his guns by running an ocean-sized search request inside the information pond that is my intranet.

Here’s a QA framework I developed that illustrates the response range in terms of the battles worth fighting (stay with the upper quadrants):


Short of remedial information literacy classes the best work-around is to focus on the use of one or two unique terms so that my user can see the lay of the Rocket land in my shop before plundering ahead with anything more esoteric or complex. I can also engineer a search outcome that breaks the question down in terms of the topic addressed. But that works best for blank, receptive brains — not for domain experts.
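
The one-or-two-unique-terms work-around can be sketched in a few lines. The example below assumes a tiny in-memory list of document titles standing in for the intranet. The function name, the titles, and the query are hypothetical, and a real SharePoint index would be queried through its own search interface rather than by splitting strings.

```python
def rarest_terms_with_hits(query, documents, keep=2):
    """Return up to `keep` query terms that occur in the fewest documents,
    skipping terms that occur in none (those would produce zero hits)."""
    doc_words = [set(doc.lower().split()) for doc in documents]
    counts = {}
    for term in dict.fromkeys(query.lower().split()):  # de-duplicate, keep order
        hits = sum(1 for words in doc_words if term in words)
        if hits > 0:
            counts[term] = hits
    # sorted() is stable, so ties keep the order the user typed them in.
    return sorted(counts, key=counts.get)[:keep]


# Hypothetical intranet contents.
intranet = [
    "quarterly propellant safety audit",
    "launcher maintenance schedule and safety checklist",
    "safety training calendar",
]

starter_terms = rarest_terms_with_hits(
    "propellant launcher safety hypergolic", intranet
)
print(starter_terms)
```

The term with no hits drops out and the most common term is left for later, so the user starts from the narrowest footing the pond can actually support.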

Ultimately the best end run around the “no bad questions” mindset is to connect people and dispense with relevancy scoring for documents. Once we’re past that we can actually prove what a good question can be. But only by providing a sound answer, and people deliver those better than PowerPoints.