Transcript

This transcript was generated automatically and may contain errors.

Hey there, welcome to the Posit Data Science Hangout. I'm Libby Heeren, and this is a recording of our weekly community call that happens every Thursday at 12pm US Eastern Time. If you are not joining us live, you miss out on the amazing chat that's going on. So find the link in the description where you can add our call to your calendar and come hang out with the most supportive, friendly, and funny data community you'll ever experience.

Okay, I'm so excited to introduce our featured leader today, Martin Frigaard. Martin is a senior Shiny application developer, and he's here to tell us all about his experience as a Shiny app developer and his journey. So get ready for all the questions about all things Shiny. Martin, would you introduce yourself, tell us a little bit about what you do, and something you like to do for fun?

Okay, great. Yeah, so I'm Martin Frigaard. I just want to say first off, I'm appearing on the hangout as myself. All the views and opinions are my own and do not represent those of my employer. That being said, I'm an R programmer and Shiny developer. What I do for fun, it sounds like a distant memory. I have two small children now, so most of what I do for fun involves what they want to do for fun. But before that, I spent a lot of time in my 20s and 30s doing a fair amount of Jiu-Jitsu. I did a little bit of weightlifting and some other gym-oriented stuff. And then for a small period of time, I like to consider myself to have been an amateur journalist. So I spent a fair amount of time kind of in the journalism world. In fact, my first start with R was writing for Storybench, the blog of Northeastern University's journalism school. So my first R tutorials, way back in like 2015 or 16, were written for a journalism school. So I still mostly just write now. I write things to myself, it seems like more and more. They start out as little notes and turn into long essays where I'm lecturing myself on advice that I don't take.

Becoming a Shiny developer

Well, we have a question that we can jump in and ask right away to get us started. And it is from Nathan, who says, how does one become a Shiny developer? I did not know that was a job title.

Yeah, it wasn't. I think it was like broadly app developer, but now I will actually have recruiters, and you'll see them list Shiny app developer, because it's obviously become big enough to become its own thing.

I think that the path is probably going to be more R programming. I really recommend spending a little bit of time in package development. So once you get started with the basics of R and syntax, a little bit of knowledge of package development will make you a better Shiny app developer. And then, yeah, if you're just getting started, I would really try to be as bilingual as possible between Shiny for Python and regular Shiny. Every time I pick up Shiny for Python, I'm impressed at the things it can do. And it seems like with the crossover between the two, you would kind of want some of that flexibility.

I think that the package versus Shiny app distinction is not clear for a lot of people, or why it's important. A lot of Shiny apps are actually developed as packages. Could you talk a little bit about your experience with that?

Yeah, that's why I wrote a book called Shiny App-Packages.

Shiny app packages and frameworks

So when I started writing the book, there were a lot of questions. At the time, I was working in the biotech pharma space, and there was a lot of adoption of Shiny and questions about, you know, should we use golem? So these frameworks were released. And golem is a fantastic framework out of the ThinkR group. They have a book that really walks you through it, I think it's Engineering Production-Grade Shiny Apps, so it's probably similar. But it's a really steep curve for package development. So there's a great book also aptly called R Packages. I think it's in its third edition now. That's a great overview of how to develop your R code into a package. The R reference manual has its own, I think it's called Writing R Extensions, that has a lot of that information, but really in a kind of hard to find or navigate way. So R Packages has a bunch of great information on how to make your R code easier to download and install on any computer.

And so what I was missing in my role was, well, what if we don't want to adopt a framework? What if we just want to develop a Shiny app as an R package? And so the book was written just as a, you know, there was a gap in Shiny app development. And actually, I can go off on a tangent here. So you remember Joe Cheng was going to write the Shiny book. And Joe Cheng has said this on stage, so I don't feel bad throwing him under the bus here. Joe Cheng didn't write the Shiny book, but Hadley Wickham wrote the Shiny book. And in Hadley Wickham's book, packages are introduced in, I think it's like chapter 19 or 20 of Mastering Shiny. And, in my opinion, it should be like chapter two or three. Because so much of your development of a good Shiny application, you know, will be driven by a lot of package development behaviors and tools. So things like loading the code, things like writing tests, that kind of stuff to make sure that your application, especially if you're going into a production environment.

You know, Shiny App-Packages basically was, I think I even have like a Venn diagram. It was like, you know, there was this spot missing of, okay, I know how to develop an R package. There's all this great material on how to build a Shiny app, but like, how do I build an R package with a Shiny app that isn't in a framework? When I wrote it, I think Rhino was like in its infancy. I think it's a little bit bigger now. And Rhino is its own animal, not just literally, but figuratively. It's not a package. So Rhino applications are not packages. They actually use, I don't want to get into it, but they use box, which is this kind of different, unique way of maintaining dependencies.
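
To make the "Shiny app as an R package, no framework" idea a bit more concrete, here is a minimal sketch of what one file in such a package might look like. It is only an illustration under assumptions: the package name myapp, the file path, and the function names app_ui(), app_server(), and launch_app() are hypothetical and are not taken from the book or from the call.

```r
# Hypothetical file: R/launch_app.R inside a package called "myapp"
# (in a real package, shiny would be declared in DESCRIPTION/NAMESPACE
# rather than attached with library())
library(shiny)

# UI lives in a function so it can be documented and tested like any other code
app_ui <- function() {
  fluidPage(
    textInput("name", "Your name"),
    textOutput("greeting")
  )
}

# Server logic is also just a function in the package
app_server <- function(input, output, session) {
  output$greeting <- renderText(paste("Hello,", input$name))
}

# Standalone function that builds the app object; after installing the
# package, a user would call myapp::launch_app()
launch_app <- function() {
  shinyApp(ui = app_ui(), server = app_server)
}
```

With a layout like this, the usual package tooling applies: devtools::load_all() loads the functions during development, and tests can exercise the server logic with shiny::testServer().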

Rhino is more of a framework. It's an integrated framework. Yeah. It helps you use software engineering principles really more to do what you're doing with your Shiny app development. And being very, very precise in dependency management and other things. You know, I think that, well, I think it's been used pretty successfully in some of the FDA submissions. So I know that it's great for what it does. I think that what my advice always is for newcomers is that it's still worthwhile to learn how to write R packages. It's still worthwhile to learn how to develop Shiny applications. And then once you get to a point where you want to take on something like Rhino, having that framework to kind of compare and contrast will just make it easier to learn that.

So yeah, I think getting into Shiny, I think there's never been a better time to get into Shiny development because it seems like I see more and more. In fact, just this morning before the call, I was deep into this. Let me just do a shameless plug for this amazing newsletter from the RDM Weekly on Substack. It's a great newsletter and they had this fantastic app that is a readme builder. So let me just plug everybody else's tools here. Yeah, there's a readme builder Shiny app that I was just really fascinated by and was trying to resist the temptation to dig into before I talked because I didn't want to have a bunch of distractions.

But yeah, so the Shiny app space seems like it's getting larger every day. And Shiny for Python, like I said, I think is worth learning. I know that there are two syntaxes there. So it probably is personal preference on which one you want to learn. But yeah, it's, I would say, equally impressive and probably will follow the same kind of adoption trend.

Okay. We're going to get a link in there to the RDM Weekly from Crystal Lewis. That's who it was. Crystal is amazing. And I wanted to hop in and just talk about what Shiny is for a second just in case anybody has not used it. I saw some questions that were kind of filtering through the chat about Shiny. And the questions were tending towards like, oh, it sounds like Shiny app development is more software engineering. But writing a Shiny app, you can write it in R. And that's sort of why it exists, right? So that if you are a data scientist, a data analyst, and you do most of your work in R, you don't have to learn JavaScript and you don't have to learn HTML in order to create something and deploy it that lets you have an interactive app for data science or data analytics purposes, which is amazing. It makes web development, web app development, much more accessible for somebody who writes mostly in R or mostly in Python as well. There's Shiny for Python.

And Shiny for Python has two different versions because there's an Express version. And then there's a regular version, which is a lot more like the R syntax. So definitely go explore Shiny because we are making it sound like it's really, really heavy software engineering. But you can make a Shiny app in a few lines of code and have it run and kind of dip your toes in the water. It's really, really amazing and fun. So I put a link to Shiny, but you can Google Shiny apps and get to all kinds of information, including some really easy get-started material: a few lines of code and you'll be good to go. And there's a wonderful community online of Shiny people who would be happy to help you and answer your questions.
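
As a rough illustration of the "few lines of code" point, here is a minimal sketch of a complete Shiny for R app. The input ID n, the output ID hist, and the histogram itself are placeholders chosen for this example; nothing here was shown during the call.

```r
library(shiny)

# UI: one slider input and one plot output
ui <- fluidPage(
  sliderInput("n", "Number of points", min = 10, max = 500, value = 100),
  plotOutput("hist")
)

# Server: redraw the histogram whenever the slider changes
server <- function(input, output, session) {
  output$hist <- renderPlot({
    hist(rnorm(input$n), main = "Random normal draws")
  })
}

# Build the app object; printing it at the console or calling runApp(app) launches it
app <- shinyApp(ui, server)
```

Saved as app.R, this runs locally from the IDE without being deployed anywhere.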

Learning Shiny: pet projects and resources

Okay, let's head back over to the Slido. Thank you Nathan for asking that great question to get us started. And I saw that Noor asked a question that got some replies. I would love to just put this out there so that anybody in the chat can reply and Martin, you can reply as well. You had already mentioned a few resources like your book. Noor, would you like to unmute and ask your question live?

Sure, hi, my name is Noor. Thank you for coming to the Hangout, Martin. I don't know why I just said that. Anyway, best way to learn Shiny or best projects to start with, resources are also helpful. I've seen a lot of resources in the chat. Basically, how do I become one of these happy, shiny people?

I think that, yeah, well, you can be a happy, shiny person right now. Just decide to be, and you seem pretty happy already. But, no, the thing that I am a big fan of, so I started building Shiny apps with pet projects, right? At the time that Shiny came out, jeez, it was probably 2015 or 16. And it was, you know, it was still very early. I didn't really know what I could do with it, but I knew I was tired of developing, you know, tables for research manuscripts and that kind of stuff. So just the idea of being able to put the output from an R program into something that was interactive was really exciting to me.

And what I really recommend to people, there's so many great books. The reason I brought up the date is that at the time, I think Analyzing Baseball Data with R was in its first edition, and I think it was all written in base R. It's since been updated. I think they have a new third edition coming where it's been converted to tidyverse stuff. So I highly recommend that book if you're a baseball fan. But there's so many great books on, you know, doing R data analysis in a particular domain. And that like ranges from NFL to, I'm gonna skip this portrait now because that was like how I got into it. But, you know, whatever your flavor of data that you find interesting, you can probably find a book written in R.

But so to answer your question, I had a pet project and I got into Shiny because there was like no pressure other than my own curiosity to get an application built. Once I had a pretty decent understanding, I immediately advertised myself as a freelance, I think I just used freelance app developer, but like a freelance Shiny developer. And one of the first contracts I got, something that was publicly available that I could share, was when the American Diabetes Association asked me to build a Shiny application for a surveillance error grid analysis. I could probably find it. The last time I checked, and watch, now it's not going to be working, but I built this thing in like 2017 and it still works. You still see the original fluidPage. It's by no means, I would say, something I pull up as representative of my best work, because it's the first app that I remember building as a freelance Shiny developer. But that app was only possible because I spent a bunch of time learning the inner workings with my pet project, right?

So there's a lot to be gained from a curiosity-driven project where there isn't a spec sheet, you don't have deadlines, there aren't stakeholders that are telling you where to move things and how the UI should look and stuff. So you can really kind of go nuts with, you know, Shiny's features and kind of see what you can do, with the added benefit that you're exploring data that you're interested in.

And I think that that is kind of the best way to learn it. You can definitely use Mastering Shiny as a resource and all the Shiny reference, but you know, the thing that will really, I would say, push your abilities as a developer is if you have data that you're interested in and want to explore, because that way, you know, you'll actually make something useful to you. And that's really, you know, at the end of the day, a Shiny developer, any app developer, is trying to make something useful to the stakeholders who take over that app.

Advocating for Shiny over point-and-click tools

And I've, there's a question here that is anonymous, so I will ask it. It says, how do you get your team to switch to Shiny instead of Tableau? Tableau makes me sad. I will replace Tableau with like any point and click dashboard creation tool. Do you have any tips on advocating for Shiny as a tool? Or maybe like why it's difficult for leadership to wrap their head around using an open source tool that's code first like Shiny?

Yeah, I think the code first is probably the biggest hurdle. Yeah, so getting people to write code who don't write code is, you know, it's, I think you have to really get them to kind of build on, you know, getting them to build on successes. I think that, you know, we use the phrase a lot, like let them eat cake first, right? You know, I think that part of the reason that Tableau, Power BI, all these, you know, these point and click kind of GUI interfaces are so popular is because, you know, for the majority of people using them, they not only have attached in their mind a certain like expediency and a timeline in which these things are built, but they have a mental model for how, you know, like if I want to build a dashboard or I want to, you know, with Tableau, it's, you know, I want to build this thing, I want to have maximum kind of like, you know, what they perceive is I want to be able to customize it. And then I want to be able to share it quickly with whoever's asked for it, right?

And I think that for them, the biggest hurdle is I'm going to go from a mental model and process and timeline that I'm vaguely, you know, comfortable predicting outcomes with to, I'm going to go learn a syntax and a language in which, you know, there's just a total uncertainty on how quickly they'll be able to develop, to deliver and develop the same product, you know, products, the same outcomes. So I think that it kind of takes baby steps in getting them comfortable with doing anything in code versus point and click. And once they can kind of see the ecosystem around, like why we use code, you know, the fact that you can be version controlled, the fact that you can collaborate with it, the fact that, you know, nothing, no GUI is ever going to be a full language, right?

Because language gives us, I'm going to forget what Wilhelm von Humboldt said about language, that it's infinite, you know, kind of outcomes with finite, you know, objects. So it's the fact that a language gives you the ability to literally create kind of anything with endless combinations of its, you know, verbs and grammar. And that's why, you know, no matter how many things you put into a GUI interface, you still have two clicks with a mouse, right? That's all you can do is click or not, right? And the fact that you have a language, like ggplot2 is a perfect example of, not only is it a full grammar of how to build beautiful visualizations, but the more you use ggplot2, the more you can pick up any graph or look at any chart and understand it better, because you have a language of understanding graphs that you didn't have prior to programming with that.

And, you know, computer programming languages are the same way, you know, when it comes, R has its origin in statistics and it isn't long before you're doing statistics with R that, you know, the programming in many ways complements your statistical knowledge, right? So the two kind of go hand in hand. And I think that when it comes to getting people to switch over to a syntax based tool, it's just them kind of unlocking the ability of the language versus, you know, kind of resisting that, you know, that point and click, which like I said, most of the time it's based around, you know, whatever, wherever you're working, that organization has timelines and expectations for, you know, I want a dashboard and I want to, you know, with these particular features and the person who's delivering it in Power BI or Tableau, like I said, they can confidently deliver something within that timeline. And so the opportunity cost of giving up a known tool with timeline, even though it doesn't have some of the features that we know that like Shiny would have, you know, it's going to an unknown, both with the syntax and with the actual application.

Shiny modules explained

Awesome. Thank you. And I wanted to call out something that Renato had said in the chat. He says, you don't even have to deploy a Shiny app for it to be useful. You can hit play locally and have it do things for you with the push of a button. This is often how I use Shiny apps. I'm like, I need to do a thing that would be so much quicker if I could interactively be fed things and click through them, right? Like I could verify something, put it in one category or the other. It's easier for me to do that with my eyeballs than to go through and go through like a list or put it in a spreadsheet. So I'll just build something that I will literally never use again, but it will be useful for me in the moment.

Well, I have a question here from Arsenis in Slido. Arsenis, would you like to ask your question live? Yeah, sure, thanks. And Martin, thanks for taking the time and chatting with us today. I've basically decided to come in with a bit of a technical question. As a fellow lover of Shiny or somebody who enjoys kind of playing with it and doing stuff, I have a question about modules. So when I write code in R, which is what I write primarily in, I love writing functions. I love vectorizing things, generalizing my code as much as I can, but I can't seem to wrap my mind around modules even though I do have a pretty solid basic understanding of reactivity. So I'm wondering if you might explain modules like I'm five.

Yeah, modules are to your Shiny app what your folder directory is to your operating system. So the best way to think about modules is that when you create another namespace, it's really not that complicated. It just appends the name of that namespace into the application's reactive model. So the simplest case is, like, if you have some input module and then some output module that has a graph, right? And so you're collecting like three dropdown inputs with this input module and you're shooting them over to your graphing function in the output module.

Namespace collision is what you're trying to avoid with modules, right, because it's all one namespace with just an app.R file. And when you append a namespace, it's just literally, it's like, what's the string function where you're just pasting together the name of the namespace and the ID, right? So your input IDs, you know, let's say you have inputs x, y, and z, and you have a module that's called inputs: they become inputs-x, inputs-y, inputs-z.

And the reason that you have these, you know, these appended namespaces, it's back to the folder and directory analogy: you can't have two folders with the same name in the same directory, right? In your root directory, you can't have like two homes. You can't have two folders with the same name in the same space. It's the same thing with namespace collision, right? If you have inputs and you are like me, you don't have original names for all of your inputs, you know, you will eventually have namespace collision, because without modules the namespace is shared throughout the whole application.

By adding the namespace function, NS(), and then the module, the two functions work together. NS() just really appends the name of the module in the UI. And then in the server, you see that as like, it has the same syntax that you would have in a regular server function, right? But it's just reading from the UI from that particular module, that particular input. So it's really just appending a bunch of like, you know, prefixes to your inputs. So that way you don't have any namespace collision.
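
The prefixing described here can be seen directly at the R console. A minimal sketch, assuming a module ID of "inputs" and input IDs x, y, and z to match the example in the conversation:

```r
library(shiny)

# NS() returns a function that prefixes IDs with the module's namespace
ns <- NS("inputs")

ns("x")
# "inputs-x"

ns(c("x", "y", "z"))
# "inputs-x" "inputs-y" "inputs-z"

# In a module UI function, every input ID is wrapped in ns(), so two copies
# of the same module never share an ID
inputs_ui <- function(id) {
  ns <- NS(id)
  selectInput(ns("x"), "X variable", choices = c("a", "b", "c"))
}
```

The separator is a dash, which is why the combined IDs read like short folder paths.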

I think it's like naming your children different things. They all have the same last name.

Yeah, yeah, exactly.

I understand the namespace thing, but what is a module?

A module is just a small piece of interactive code, usually accompanied with, it's like a mini UI server combination, right?

Yeah, yeah. Self-contained, self-contained. So it's like a Lego piece. Like if you're building a Lego wall.

But those two things function like a regular UI and server, but just on a smaller, more modular level, right? And you can nest modules, but all that does, again, is just like another subfolder, right? Or in Libby's example, it's just a middle name, right? You're further identifying that particular input. So you're basically creating a small container that has both client and server reactivity contained within it. Yes. And then, because of the namespace thing, it's then reusable. Yes, yes.

You'll have your inputs that can then be just read out of your module and then fit into your outputs module. That's the piece I was missing, is that it's both in one thing. Yes, you have a UI and server component in each module that function just like your regular, when you build an app.R file, just your standard boilerplate, they function the same way a UI and server call work there. The biggest benefit to using them is just that it mentally chunks a component of the app into, like, I'm now only dealing with these inputs and how they're returned. Or, depending on how you design it, it can be like, I'm only dealing with this input and the output it creates, right? Really, module design is kind of up to the individual programmer and the patterns are just preference. But the simplest example that I always think of is you have a module for inputs, a module for outputs. The input module will return all the inputs. And then in the main server, you'll collect those inputs and pass them into the server of the output module.
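
The "module for inputs, module for outputs" pattern described above might look something like the following. This is a sketch under assumptions, not the speaker's own code: the function names, the module IDs "inputs" and "plot", and the use of the built-in mtcars data are all made up for illustration.

```r
library(shiny)

# Input module: the UI collects the choices, the server returns them as reactives
inputs_ui <- function(id) {
  ns <- NS(id)
  tagList(
    selectInput(ns("x"), "X variable", choices = names(mtcars), selected = "wt"),
    selectInput(ns("y"), "Y variable", choices = names(mtcars), selected = "mpg")
  )
}

inputs_server <- function(id) {
  moduleServer(id, function(input, output, session) {
    # Inside the module, input$x refers to the namespaced ID "<id>-x"
    list(x = reactive(input$x), y = reactive(input$y))
  })
}

# Output module: the UI holds the plot, the server receives reactives as arguments
plot_ui <- function(id) {
  ns <- NS(id)
  plotOutput(ns("scatter"))
}

plot_server <- function(id, x, y) {
  moduleServer(id, function(input, output, session) {
    output$scatter <- renderPlot({
      plot(mtcars[[x()]], mtcars[[y()]], xlab = x(), ylab = y())
    })
  })
}

ui <- fluidPage(inputs_ui("inputs"), plot_ui("plot"))

# Main server: collect what the input module returns, pass it to the output module
server <- function(input, output, session) {
  selected <- inputs_server("inputs")
  plot_server("plot", x = selected$x, y = selected$y)
}

app <- shinyApp(ui, server)
```

Passing the reactives themselves (selected$x, not selected$x()) keeps the output module updating whenever the dropdowns change.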

Yeah, and what's great is that it's just like writing functions, right? Like, you don't want to repeat yourself. You don't want to write the same piece of code for your UI in Shiny over and over again. And you're like, oh, well, I'm doing the same thing on this other tab that I did over there twice. Just make it a module. Use that same module and just change the input or output that you need. And like Martin said, like, collect those inputs and outputs in whatever other module you're using for the output. But yeah, I think that this is a great conversation. If you're interested in modules, there are a couple of great talks by Deepsha Menghani that are recorded that are out there. Look up modules, Deepsha Menghani. And I will try to get some links in the chat for those as well. She does a great job of helping to explain.

I would say, too, when you're debugging modules, the first thing that you're going to wonder is, okay, so what does this look like floating around in my namespace? There's a really handy tool, and I do this all the time when I'm dealing with an application that has a lot of modules or nested modules. There is a reactiveValuesToList() function in Shiny. And what you can do is you can just drop that into, like, a renderPrint(), and then a verbatimTextOutput(). Just anywhere in your UI. You know, usually at the top level. So, like, right at the top of the app.ui. Or app.R. Sorry, ui.R. I forgot we don't do that anymore. It's all app.R now. In your UI, you're going to just drop a verbatimTextOutput(), and then somewhere down in the server, just do a renderPrint() of the reactiveValuesToList(). Just assign it to an object, spit it out in the UI. And what that does is it actually gives you a very clear idea of, like, how the namespacing works. Because you'll see those appended names of the module on the actual input ID or output ID. So, you can actually see all of your reactives in real time without, like, shutting down your app or trying to use the debugger. You can just basically spit out the reactive values right in the UI and look at them as you're changing them and moving around.
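
A minimal sketch of that debugging trick, assuming a toy module (counter_ui() and counter_server() are hypothetical names) used twice so there are namespaced inputs to look at. The functions reactiveValuesToList(), renderPrint(), and verbatimTextOutput() are the real Shiny functions mentioned above; the output ID "debug" is arbitrary.

```r
library(shiny)

# Toy module so there is something namespaced to inspect
counter_ui <- function(id) {
  ns <- NS(id)
  numericInput(ns("n"), "A number", value = 1)
}

counter_server <- function(id) {
  moduleServer(id, function(input, output, session) {
    reactive(input$n)
  })
}

ui <- fluidPage(
  # Debug panel dropped in at the top level of the UI
  verbatimTextOutput("debug"),
  counter_ui("first"),
  counter_ui("second")
)

server <- function(input, output, session) {
  counter_server("first")
  counter_server("second")

  # The top-level input object holds every input in the app,
  # with the module prefixes attached to the IDs
  output$debug <- renderPrint({
    all_inputs <- reactiveValuesToList(input)
    print(all_inputs)
  })
}

app <- shinyApp(ui, server)
```

Running the app and changing either number updates the printed list, where the entries appear under their prefixed IDs. The panel is meant to be removed once the namespacing is understood.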

Posit wish list and AI tooling

We had a question that came through that was anonymous that says, I'm curious to know what's on your Posit wish list. As a person who uses a lot of Shiny, is there anything that you wish that it could do that it doesn't do or that any of the products you use could do? Yeah. I actually erased my doodles because I'm insecure about my artistic talent or lack thereof.

I was wondering, so that's my questions for, like, Posit developers that are building stuff that I'm using. So my question was related to the great series of packages actually written by Simon Couch, who wrote, it was originally pal, and now I think it's chores, gander. Yeah. Oh, and then I think ensure is the other one. So he has a collection of LLM tools. And, yeah, I think there was actually just a bunch of chicken scratches and then a big question of, like, what is this? Because I was wondering whether, like, chores could be classified as, is it a skill? Is it a tool? You know, how would you, like, in kind of, like, Claude/Anthropic parlance, you know, how would you categorize what's being created with a chores helper? And he answered that. So, yeah, I was comfortable deleting it along with my artwork.

But, yeah, usually the wish list is always something that I'm using, you know, either, like, within Positron or RStudio, that I have questions about, or, you know, is this coming soon? The biggest thing that you guys actually did release was the Assistant pane in the RStudio IDE. You know, I was wondering when that was going to be released, you know, some kind of equivalent to Positron Assistant. And now that's out. And I think that's, is that shipping in the preview or is that in the regular RStudio download now? That's a good question because it was just recently, but I think that it is available. And that is Posit Assistant. Yes. As opposed to Positron. I know all the names are so similar.

We all just blame Joe Cheng and he takes the responsibility. And then there's like Posit AI. And now I have this other word that just fits in there. But there's Posit Assistant, which works via a Posit AI subscription in the RStudio IDE. And then there's Positron Assistant, which is configured in Positron. Not to be confused with Databot. Not to be confused with Databot. And, you know, I think that all of these things will shake themselves out. The thing is, innovation with AI happens so quickly that things change. You know, in six months, everything will be completely different. So I think naming conventions have been a difficult thing. Naming things is the hardest part of data science. We all know this, right? So we forgive Joe Cheng. All the wonderful things that he's inventing. The naming is hard.

Well, I think that chores is amazing. I stuck up a link to chores and also pasted the hex image in there. Because I think that little potato is just so cute. I think of him as pal still because it was named pal to begin with. But I also have to do, because I brought it up, I have to do a shameless plug for the btw package. Because it's, I think Garrick is working on that. Yes, Garrick Aden-Buie. Yes. And that actually is just such a great read, going through the documentation on the btw package itself. Not just using it in your development, but it's like a master class in skills and tools. It's a really, really great package. The source code for it is amazing.

Journalism, career path, and communication

Renato says, how do your journalism interests interact with your Shiny work?

Yeah, so this is just more about my personal path into this field, which, for everyone, once they get into a position, it always sounds very linear when people describe how they ended up there. But I want to be the first to acknowledge that I fell into data science. I started off, and I promise this isn't a tangent. This goes back to journalism. I started off going to school for exercise science, so I thought I was going to be a sports medicine MD.

When I got out of undergrad, that was the Great Recession, the previous recession. So, you know, it was difficult to find somewhere to work while you're studying for the MCAT. So a professor who was great, you know, looking out for me, said, you should jump into, you know, get your master's in exercise science, because, you know, you already have a bachelor's and there's kind of nothing out there right now. And I took his advice. So that was the first one that wasn't my idea. And then the jobs that were available for a graduate student were research assistant jobs, one of them was a research associate, and we had grant-sponsored projects that I worked on, and I needed to get some R scripts to run. So I learned R out of necessity. You know, of course, as soon as I found the R community, then I loved it. But I didn't, you know, make a "what am I going to do with my career? I think I'm going to get into data science" decision. That was never a decision I made.

So fast forward to, I'm now wanting to go, you know, into some analytic field. And at the time, data science was a thing, but there weren't really that many programs. So this was like 2015 or so. And UCSF had a master's in clinical research, and Berkeley was like talking about a data science program. And I was in the Bay Area for the same reason everybody goes there, you know, I was going to work at a startup and, you know, go public and get rich. Never worked. But the thing is, you know, it was more that I didn't want to wait around for a data science program. So I jumped into UCSF's clinical research. And that's really how I ended up in the biotech pharma space. Right. So, you know, all of these look like decisions on a CV, but it was external forces, kind of life batting me around into certain areas, more than an active decision.

As for journalism, I had a friend, and I'll go ahead and name him because he'd be great to have on this: Aleszu Bajak. He and I wrote together at the journalism school's Storybench blog, and he's gone on to work at USA Today. He was at the Urban Institute for a while. I'm not exactly sure where he is now. But the trade was, I was going to show him how to use R and he was going to improve my writing. If you've ever been in a graduate program, you spend a lot of time writing articles for peer-reviewed journals that, I hate to break your heart, no one wants to read. They're all written in passive voice. They're all very dry. There's a reason why nobody other than other academics reads them. So I went to him desperately looking to improve my writing so people would actually want to read what I wrote.

And I think one of the myths you're told is, oh, the data speaks for itself. You just show them the data. Journalism students and people who are skilled with communication know that you're really working toward making sure your audience is engaged and that you're getting them information clearly and concisely. When you're trained academically for writing, it's almost like volume equals quality. So it's good to have somebody, especially nowadays, just in terms of time and economy, who can help you get your writing down to a level where you can be clear and concise without removing the important bits. Journalism was just one of the ways that I wanted to improve my communication, and not just in writing generally, because so much of what I do is technical writing. I wanted my technical writing to also be clear and concise, and written so that a wide audience could read it and understand it.

Amazing. You're in good company. If you are not looking at the chat, there are a bunch of people going, I was in X field and fell into data science from there, right? Nobody has a linear pathway straight to data science. We all come from different places. I see zoology. I see biostats. I see biomedical science. There are so many pathways. And then I see one person, I think it was Gonzalo, who said, I tried to get into data science and then fell into being an AI engineer. Sometimes the squiggly pathways might start at data science and go somewhere else.

Yeah, I think honestly, for the people with the most interesting career paths, it really isn't a plan. It's this accumulation of unique skills that makes them particularly well fit for a type of problem. It's that unique combination of skills that makes you valuable to any organization or any field. It's not checking boxes along the way. There's a great book written about this called Range by David Epstein, and it was like a follow-up to Malcolm Gladwell's 10,000 hours. It's a phenomenal book that really points out that we put such an emphasis, especially in technical fields, on expertise.

I'm more of a fan of good enough than I am of expertise. I think that good enough, meaning good enough knowledge to get work done, good enough knowledge to get where you need to be, is so much more valuable than being an expert in something. We need experts. Absolutely. I'm not dismissing any experts, but we don't need everyone to be an expert, right? We need more people that have good enough knowledge in a variety of fields.

And I think this will just become increasingly true with all the technological advances or changes we see coming. What I've seen thus far with the LLM tools is that your knowledge around the data science tech stack is going to be much more desirable than knowledge of any particular programming language. You'll still need to know the language, but implementing these tools across the technology in the ecosystem you work with is going to require knowledge of things outside just your programming language.

Code quality and maintainability

Data analysts are increasingly being asked to use R and Shiny. Many can write code that works in the short term, but struggle to build code that stays reliable and maintainable as projects become more complex. So what advice would you give analysts who want to improve their code quality as they grow?

Yeah, I think this is just part of the nature of open source. It is always fun to use the developer-written packages from GitHub and that kind of stuff. And I've been just as frustrated as anybody when your dplyr code suddenly doesn't work because of some change in the syntax. This is kind of the nature of open source tooling: these things evolve and change over time. So for stability, base R is more your friend than you realize. I think people underestimate how important base R is to doing a lot of, I would say, just regular programming. I haven't had many problems with Shiny itself in terms of being unstable. They're really good about backwards compatibility. But I always explain to developers that I'm either supervising or working with that the further you go out on that edge of packages under active development, the further you push the boundary, the more it comes at a cost of additional maintenance. And maintenance is actually like data wrangling used to be: the thing that nobody wanted to talk about for statisticians. How much time do you actually spend cleaning the data versus analyzing the data? It was like an 80-20 split. I think the term data wrangler actually sounds pretty cool, but everybody wanted to be a data scientist.

But I think that the maintenance of applications developed in open source is just a part of the cost of doing business, and I don't think we price that in enough. It should be part of the expectation that if you're using bleeding-edge, cutting-edge stuff, it's like building a Ferrari. It's going to perform really well, but you're going to have to provide a lot of maintenance to make sure it performs. You can scale it back quite a bit, use only CRAN packages, and increasingly depend on base R for stability if that is your primary concern. Those are trade-offs you have to accept with open source tools.

I will add to this by talking about the hardest time I've ever had maintaining a Shiny app, because I have been a Shiny developer, actually, for a short time, and I've done freelance work as a Shiny developer. The hardest job I've ever had was when a Shiny app was built using non-CRAN packages that were not versioned and were just living on some random person's GitHub repo, someone who made a package one time and then never touched it again. So my plea to the universe is, if you are building a production-grade Shiny app, you're deploying it for somebody, you're doing something for freelance work or otherwise, please, please, please try to stick to vetted and versioned packages that are on CRAN and that will stick around. Otherwise, people are not going to be able to restore or build back to your version, especially if they're built on renv.

No, that's exactly, yeah, that's really what I was getting at. There are really great developer-built packages that add quality and features to your application that I would say are essential. But because they're not on CRAN, maybe contact the author and see what their plan is for that package. I don't know how many of them would respond, but if you said, hey, I want to make sure this lasts into the future, I can't imagine a developer saying, no, by all means, don't submit my package to CRAN. Become a co-author of that package if it's valuable to you and make sure it ends up on CRAN.

Your problem is solved. Like Martin said, base R is so much more powerful than you think it is, and having fewer dependencies because you did something in base R instead of depending on a package can be really, really powerful. You can cut down your dependencies. I've definitely been guilty of looking at a package and being like, oh, I can do that in base R. I'm just going to replace that function with a little bit of base R, and I still get the functionality without needing a dependency.
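As a small illustration of the swap described here, the sketch below does a few common tasks with base R only. The sample log lines and variable names are made up for the example, and the package functions named in the comments are just typical candidates for replacement, not ones mentioned in the conversation.

```r
# Hypothetical sample data, used only for illustration.
x <- c("INFO start", "ERROR disk full", "INFO done")

# Where you might reach for stringr::str_detect(x, "ERROR"),
# base R's grepl() does the same plain-text match.
has_error <- grepl("ERROR", x, fixed = TRUE)

# Where you might reach for purrr::map_chr(), vapply() returns a
# character vector and checks that each result has length one.
first_word <- vapply(strsplit(x, " ", fixed = TRUE), `[`, character(1), 1)

# Where you might reach for dplyr::count(), table() tallies the values.
counts <- table(first_word)
```

The trade-off is a little more typing in exchange for one less package to install, version, and maintain.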

Bridging the gap between Shiny developers and IT

Okay, well, we have time for one more question because we have about six minutes left. Mike Smith, I believe you had a question. Okay, I just put it in the chat for everybody. It says, the Venn diagram of DS, data science programmer, Shiny app developer, and IT org app developer has a lot of overlap, but do you think there are any practices from IT org developer that we could learn from a data science programmer, Shiny app developer perspective? This is a great question for Martin because he actually spent years as a Posit admin, sort of in a behind-the-scenes role, but was still developing Shiny apps from that direction. So you've dealt with this from both sides.

Yeah, I will say I was frustrated because, as a Shiny app developer, I'd build something and run into that wall of can't get it into production, can't deploy it, right? The DevOps kind of wall. And then I just decided, you know what? I wonder what the problem is. So I went over to the Posit system admin side and realized it's much more complicated than I thought it was. Like most things. So I think the biggest solution to this is exactly what Libby said: understanding more of what IT does and asking them to walk you through it, to the degree that they will let you. Some organizations have a very high wall between IT and developers, and in others they're able to get on a call and talk it through. But it helps to understand how services like Connect, Workbench, and Package Manager are set up in your organization, how they're configured, and what maintenance of those services looks like.

One of the first things that I did as a Posit system admin, because I'm a Shiny app developer, was think, all right, obviously a huge part of this job is looking at log files. It's just a ton of grepping on these log files. And Shiny has this amazing reactive file reader, so I built a really simple Shiny application where I could select the log file from the UI and it would just spit the log into the app. Then I could search it there instead of having to open the terminal. Believe me, some people are diehard terminal users and you're never going to change their minds. But I think a lot of times it helps if you can see what IT deals with and what they do, and approach it from a place of, hey, I just want to understand. Not selfishly, so I can get my stuff deployed, but to understand the difficulties that they deal with from a services and architecture standpoint.
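A minimal sketch of the kind of log viewer described above, assuming the logs are plain-text files ending in .log inside a single directory. The LOG_DIR environment variable, the filter_lines() helper, and the one-second polling interval are assumptions for illustration, not details from the conversation; reactiveFileReader() is the Shiny function that re-reads a file when it changes on disk.

```r
library(shiny)

# Hypothetical log directory; set LOG_DIR to wherever your logs live.
log_dir <- Sys.getenv("LOG_DIR", unset = tempdir())

# Keep only the lines containing the search term (plain-text match).
# An empty term returns every line.
filter_lines <- function(lines, term) {
  if (!nzchar(term)) return(lines)
  lines[grepl(term, lines, fixed = TRUE)]
}

ui <- fluidPage(
  selectInput("log", "Log file",
              choices = list.files(log_dir, pattern = "\\.log$")),
  textInput("term", "Search"),
  verbatimTextOutput("contents")
)

server <- function(input, output, session) {
  # Build a file reader for the selected log. The reader polls the file
  # every 1000 ms and re-reads it with readLines() when it changes.
  log_lines <- reactive({
    req(input$log)
    reactiveFileReader(1000, session, file.path(log_dir, input$log), readLines)
  })

  output$contents <- renderText({
    # log_lines() returns the reader; calling it returns the file's lines.
    paste(filter_lines(log_lines()(), input$term), collapse = "\n")
  })
}

app <- shinyApp(ui, server)
```

Running the app object (for example with runApp(app)) shows the selected log and narrows it as you type, which stands in for a grep in the terminal.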

It'll help you when you're deploying to think about, okay, what resources is this going to take up on the server? What firewall issues or networking issues might I run into with this application that IT is going to say no to or need approval for? I think that just a little bit of understanding of the sysadmin role and responsibilities makes you a much better app developer, just in terms of that final 10% of getting it into production.

Yeah. And then, of course, there's taking skills back and forth. I definitely have a different understanding of deploying applications after having worked as a system admin, including the command-line dependencies that they need to install so that I can use these R packages, because those tools are required on the server itself long before the R package will be able to access them.

Yeah. It helps you with the "why can't you just" problem, because nobody wants to hear, why can't you just X, Y, Z, when you are not seeing their struggles clearly. Yeah, I have a basic rule about not ever asking that. With anybody I work with, I avoid a "just" question because I assume all of the "just" questions have been tried and answered, right? So if I find myself thinking, why don't you just, I immediately tell myself, they've tried that. They've already done that. If it was a "just" question, it would have been tried, right? So you can come at it with curiosity from a different angle and learn instead of accusing.

Well, we have reached one minute to the top of the hour, so I feel like we must say goodbye. This was very, very fun. There are so many questions that didn't get answered in Slido, and they're so good. There's one from Adam and one from Jackie that were both sort of about testing, which I think were really great. Martin, I will get you the unanswered questions afterwards so that you can see them, and you can feel free to answer them or not, if you want, in Slido. But this was really fun. I hope you had a good time. Thank you so much for coming. Of course. Yeah. And I will, yeah, I will answer them either in