"What about trace()?". Idea behind the package • shinybreakpoint

I. The Story

The Myth

In his talk at the 2016 Shiny Developer Conference, Jonathan McPherson noted:

If you’re a seasoned R programmer, you may have used the trace() function to add tracing without modifying your script. Unfortunately, it’s not possible to use this utility (or any that depend on it, such as setBreakpoint) with Shiny. trace() works by rewriting the body of the function to be traced, so the function must already exist when you run it. Shiny generates functions at runtime that aren’t easily addressable.

It was a fragment titled “What about trace()?” and is now available as part of the official Shiny documentation¹. We could say it is somewhat suspicious: the same paragraph claims that something is impossible because some other thing is not easily addressable, but not, as we might expect, impossible to address. It’s a good paragraph to start with, even though saying that everything started because of this suspicion would be nothing more than a founding myth.

Where Are You, Objects?

Our aim is to modify the body of a function by inserting browser() there, but first we need to find all the functions. We can do this using lapply(rlang::env_parents(rlang::current_env()), names), as in the example below:

app.R

library(shiny)

ui <- fluidPage(
  modUI("mod"),
  textOutput("env")
)

server <- function(input, output, session) {
  modServer("mod")
  
  output$env <- renderPrint({
    lapply(rlang::env_parents(rlang::current_env()), names)
  })
}

shinyApp(ui, server)

R/mod.R

modUI <- function(id) {
  ns <- NS(id)
  tagList(
    numericInput(ns("num"), "Num", 0)
  )
}

modServer <- function(id) {
  moduleServer(
    id,
    function(input, output, session) {
    }
  )
}

The code above iterates over the environments, starting from the current environment and moving up through the parent environments, and displays the names of the objects (nested environments included) found in each of them.

By running this, we learn that the structure of objects in the Shiny app has the following characteristics:

at the top (i.e. as the parent) is the global environment, but in our example it doesn’t contain any objects created by us
next, going from the top to the bottom (i.e. to the first child), we find the objects from our module (saved in the R/mod.R file): the UI and server parts (named modUI and modServer)
finally, in the grandchild, we see the objects from the file where we started: server and ui from the app.R file (we can ignore the rest of the objects and environments)
some environments are named (like the global environment) and some are not

Graphically this structure looks like this:

"global": .Random.seed
  └── "": modServer, modUI
      └── "": server, ui
          └── "": input, output, session
                                      and so on...

This is, of course, evidence that the documentation was right:

all R code in a Shiny app is run in the global environment or a child of it²

Now, because we were able to find the objects (and our search was very precise - we also get the environments where these objects live), we can try to modify them.
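As a rough sketch of the same search outside Shiny, the walk over parent environments can also be written with base R only. The helper name list_functions_by_env() and the toy environments below are assumptions made for illustration; they are not part of Shiny or shinybreakpoint.

```r
# Hypothetical helper (illustration only): walk from `start` up to the
# global environment and collect the names of functions in each environment.
list_functions_by_env <- function(start) {
  found <- list()
  env <- start
  repeat {
    nms <- ls(env, all.names = TRUE)
    is_fun <- vapply(
      nms,
      function(nm) is.function(get(nm, envir = env, inherits = FALSE)),
      logical(1)
    )
    found[[length(found) + 1L]] <- nms[is_fun]
    # stop at the global environment (or at the empty one, as a safety net)
    if (identical(env, globalenv()) || identical(env, emptyenv())) break
    env <- parent.env(env)
  }
  found
}

# Toy chain that mimics the app: global -> module file -> app file
mod_file <- new.env(parent = globalenv())
mod_file$modServer <- function(id) NULL
mod_file$modUI <- function(id) NULL
app_file <- new.env(parent = mod_file)
app_file$server <- function(input, output, session) NULL
app_file$ui <- "not a function, so it is skipped"

functions_found <- list_functions_by_env(app_file)
functions_found[1:2]
```

Each element of the result lines up with one environment on the way to the global one, so an object name can always be paired with the environment it was found in.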

And Where Are Your Elements?

We want to modify the body of a function, and we will use body() for this. Without trying to be precise, but just useful for our purposes, we can say that body() returns a very complex list (although its type is not list, but language).

server <- function(input, output) {
  shiny::observe({
    x <- 2
    x
  })
}

body(server)
#> {
#>     shiny::observe({
#>         x <- 2
#>         x
#>     })
#> }

body(server)[[2]][[2]][[2]]
#> x <- 2

body(server)[[2]][[2]][[2]][[1]]
#> `<-`
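The chained [[ calls can also be collapsed into a single index vector, which is the form some of the later examples use. A small sketch follows; the function f is only a stand-in, and observe() is never evaluated here, so Shiny does not have to be loaded.

```r
f <- function(input, output) {
  observe({
    x <- 2
    x
  })
}

b <- body(f)
typeof(b)   # "language": the body is a call to `{`
length(b)   # 2: the `{` symbol and the observe() call

step_by_step <- b[[2]][[2]][[2]]  # one level at a time
by_path <- b[[c(2, 2, 2)]]        # the same path given as one index vector
identical(step_by_step, by_path)
```

Storing the location as a plain numeric vector makes it easy to pass around, for example as the result of a search for the right place in the body.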

This list can be manipulated - elements can be added, removed etc. We just need some additional functions for this, like as.list() (so we can use append() to add elements to the list), as.call() (to turn the list back into a language object) or quote() (so that browser() is not evaluated):

server <- function(input, output) {
  shiny::observe({
    x <- 2
    x
  })
}

body(server)[[c(2, 2)]]
#> {
#>     x <- 2
#>     x
#> }

body(server)[[c(2, 2)]] <- as.call(append(as.list(body(server)[[c(2, 2)]]), quote(browser()), 1))
body(server)[[c(2, 2)]]
#> {
#>     browser()
#>     x <- 2
#>     x
#> }

Now the breakpoint is set right after the first element of the body (and the first element is the opening curly brace). Let’s do the same in the complete Shiny app:

app.R

library(shiny)
library(magrittr)

ui <- fluidPage(
  modUI("mod"),
  textOutput("env"),
  actionButton("browser", "Add browser")
)

server <- function(input, output, session) {
  modServer("mod")
  
  output$env <- renderPrint({
    mod_env <- environment(modServer) # get the environment where `modServer` is defined
    as.list(body(mod_env$modServer)[[2]][[3]][[3]][[2]][[2]])
  })
  
  observe({
    mod_env <- environment(modServer)
    body(mod_env$modServer)[[2]][[3]][[3]][[2]][[2]] <- as.call(append(as.list(body(mod_env$modServer)[[2]][[3]][[3]][[2]][[2]]), quote(browser()), 1))
  }) %>% 
    bindEvent(input$browser)
}

shinyApp(ui, server)

R/mod.R

modUI <- function(id) {
  ns <- NS(id)
  tagList(
    numericInput(ns("num"), "Num", 0)
  )
}

modServer <- function(id) {
  moduleServer(
    id,
    function(input, output, session) {
      observe({
        input$num
      })
    }
  )
}

We can notice that when the Add browser button is pushed, nothing seems to happen. However, that is not true: browser() has been added, but we haven’t run the modified function yet. In other words, we have added browser() and have not used it yet. We can use another example to show this clearly:

fun1 <- function() {
  x <- 2
  x
}

body(fun1) <- as.call(append(as.list(body(fun1)), quote(browser()), 2))
body(fun1)
#> {
#>     x <- 2
#>     browser()
#>     x
#> }

How do we know if browser() really works? It’s a tricky question, because it suggests that we can know whether something will run before actually running it. Instead of trying to answer it, let’s just run the function: fun1(). We will end up in debug mode.

We can achieve the same effect in a Shiny app by refreshing the session: by hand (clicking the refresh button in the web browser) or by adding getDefaultReactiveDomain()$reload() to reload the session (i.e. by using the reload() method of the session object, so it could also be session$reload()):

app.R

library(shiny)
library(magrittr)

ui <- fluidPage(
  modUI("mod"),
  textOutput("env"),
  actionButton("browser", "Add browser")
)

server <- function(input, output, session) {
  modServer("mod")
  
  output$env <- renderPrint({
    mod_env <- environment(modServer)
    as.list(body(mod_env$modServer)[[2]][[3]][[3]][[2]][[2]])
  })
  
  observe({
    mod_env <- environment(modServer)
    body(mod_env$modServer)[[2]][[3]][[3]][[2]][[2]] <- as.call(append(as.list(body(mod_env$modServer)[[2]][[3]][[3]][[2]][[2]]), quote(browser()), 1))
    getDefaultReactiveDomain()$reload() # added
  }) %>% 
    bindEvent(input$browser)
}

shinyApp(ui, server)

Doing this, we re-run the modServer() function.
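The long one-liner used in the observers above can be wrapped in a small function to make the steps easier to follow. This is only a sketch: insert_into_body() is a hypothetical helper, not a function from shinybreakpoint, and it is shown on a regular function so that it can be run without Shiny.

```r
# Hypothetical helper (not from shinybreakpoint): insert `expr` into the
# `{` block located at `path` inside the body of `fun`, after position `after`.
# An empty `path` means the top-level block of the body.
insert_into_body <- function(fun, path, expr, after = 1) {
  block <- if (length(path) > 0) body(fun)[[path]] else body(fun)
  stopifnot(is.call(block), identical(block[[1]], as.name("{")))
  # list(expr) keeps the inserted call as a single element
  new_block <- as.call(append(as.list(block), list(expr), after))
  if (length(path) > 0) {
    body(fun)[[path]] <- new_block
  } else {
    body(fun) <- new_block
  }
  fun
}

server <- function(input, output) {
  observe({
    x <- 2
    x
  })
}

# c(2, 2) is the block passed to observe(); browser() is only quoted, not run
server_with_breakpoint <- insert_into_body(server, c(2, 2), quote(browser()))
body(server_with_breakpoint)[[c(2, 2)]]
```

The helper returns a modified copy, so in the app the result would still have to be assigned back into the environment where the function lives, for example mod_env$modServer <- insert_into_body(mod_env$modServer, c(2, 3, 3, 2, 2), quote(browser())).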

Find Them Again (And Again)

Among other (not yet mentioned) problems, we still need tools to find where to insert browser(), because until now we have just used our eyes. This is a somewhat secondary issue, as in this vignette we mainly want to show the core concept of setting a breakpoint, even if by hand, so the solutions below are only outlined.

Having a (named) function, which is in a saved file (which was source()d, so the srcref attribute is present), we can use the following functions:

utils::getParseData() to get the data.frame needed to reconstruct the whole script in this saved file (utils::getParseText() can be useful in this context as well)
utils::getSrcFilename() to get the name of the file
utils::findLineNum() to get the (1) name of the object; (2) environment where this object lives; (3) location in the body of this object based on line number in the script (i.e. which indices to use to get the right location when using body()[[]])
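A minimal sketch of how these three functions can be tried outside Shiny, assuming a throwaway script written to a temporary file (the file content and the name fun1 are made up for this illustration):

```r
# Write a small script to a temporary file and source it with srcrefs kept
script <- tempfile(fileext = ".R")
writeLines(c(
  "fun1 <- function() {",
  "  x <- 2",
  "  x",
  "}"
), script)
source(script, keep.source = TRUE)

# File name stored in the srcref of the function
utils::getSrcFilename(fun1)

# Where does line 2 of the script (`x <- 2`) live?
loc <- utils::findLineNum(script, 2)
loc[[1]]$name   # name of the function that contains the line
loc[[1]]$at     # indices to use with body(fun1)[[...]]

# Token-level data that allows the script to be reconstructed
parse_data <- utils::getParseData(parse(script, keep.source = TRUE))
head(parse_data[, c("line1", "col1", "token", "text")])
```

The at element is the part that matters most for our purposes, because it can be passed directly to [[ on the body of the function.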

However, we quickly find out that by modifying the body of the function we have broken it: utils::findLineNum() no longer returns the correct indices (the at element of the returned list is wrong):

app.R

library(shiny)
library(magrittr)

ui <- fluidPage(
  modUI("mod"),
  textOutput("env"),
  actionButton("browser", "Add browser")
)

server <- function(input, output, session) {
  modServer("mod")
  
  output$env <- renderPrint({
    findLineNum("R/mod.R", 13)[[1]]$at # check how this changed after the first use of button 'Add browser'
  })
  
  observe({
    mod_env <- environment(modServer)
    body(mod_env$modServer)[[2]][[3]][[3]][[2]][[2]] <- as.call(append(as.list(body(mod_env$modServer)[[2]][[3]][[3]][[2]][[2]]), quote(browser()), 1))
    getDefaultReactiveDomain()$reload()
  }) %>% 
    bindEvent(input$browser)
}

shinyApp(ui, server)

R/mod.R

modUI <- function(id) {
  ns <- NS(id)
  tagList(
    numericInput(ns("num"), "Num", 0)
  )
}

modServer <- function(id) {
  moduleServer(
    id,
    function(input, output, session) {
      observe({
        input$num
      })
    }
  )
}

In the example above, we display the at element of the returned list. When the Add browser button is pushed for the first time (and the debug mode is then closed by pressing c or f), we can see that the displayed indices have changed, so we can no longer use this element to find the correct location in the body of the function. In other words, our code now works just once: after the first use, the app has to be closed and run again (if at were used to determine the location in the body, of course; in the example above we still use a hard-coded location). To resolve this problem, we need to retrieve the original body of the function, and one way to do this is to use parse(). We already know the name of the file (and its full path) the function comes from, so it is possible to parse the file and assign the original body to the broken function before browser() is added.
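The idea of restoring the body with parse() could be sketched as follows. The helper original_body() is hypothetical and only handles definitions of the form name <- function(...) at the top level of a file; it is not claimed to be what shinybreakpoint does internally.

```r
# Hypothetical helper (an assumption, not the package's code): parse `file`
# again and return the body of the top-level definition `name <- function...`.
original_body <- function(file, name) {
  for (expr in parse(file, keep.source = TRUE)) {
    is_assignment <- is.call(expr) && length(expr) == 3L &&
      as.character(expr[[1]])[1] %in% c("<-", "=")
    if (is_assignment &&
        identical(expr[[2]], as.name(name)) &&
        is.call(expr[[3]]) &&
        identical(expr[[3]][[1]], as.name("function"))) {
      return(expr[[3]][[3]])  # in a `function` call, element 3 is the body
    }
  }
  stop("no top-level definition of '", name, "' found in ", file)
}

# A throwaway script, sourced into its own environment
script <- tempfile(fileext = ".R")
writeLines(c("fun1 <- function() {", "  x <- 2", "  x", "}"), script)
defs <- new.env()
source(script, local = defs, keep.source = TRUE)

# "Break" the function by adding browser(), then restore the parsed body
body(defs$fun1) <- as.call(
  append(as.list(body(defs$fun1)), list(quote(browser())), 1)
)
length(body(defs$fun1))   # 4 elements: `{`, browser(), x <- 2, x
body(defs$fun1) <- original_body(script, "fun1")
length(body(defs$fun1))   # back to 3 elements
```

Restoring the body before each new breakpoint means that the insertion always starts from the same, unmodified structure.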

And Forget Them

All the actions taken so far have led us to a state of greed: we find objects, modify them and never undo the changes; then we modify them again and again, adding more browser()s. This can be seen in our previous example: each time the Add browser button is pushed, another browser() is added on top of the previous one. Now this greed leads us into trouble:

we need to escape (using c or f) from debug mode multiple times before we escape for real - each browser() nests us into debug mode
but what is much worse - the browser() persists and each time the observe() runs (and it runs every time the input$num changes), we end up in the debug mode

The remedy for this would be to remove the browser() after each use, i.e. after pressing c or f (after closing) in the debug mode. But it must be done within the function which is debugged (because of the session reload). In other words, the function will remove its own elements. Here is an exemplification of this mechanism:

fun2 <- function() {
  2
  env <- environment(fun2) # get the environment where `fun2` is defined
  body(env$fun2) <- body(env$fun2)[-c(3, 4)]
}

body(fun2)
#> {
#>     2
#>     env <- environment(fun2)
#>     body(env$fun2) <- body(env$fun2)[-c(3, 4)]
#> }

fun2()

body(fun2)
#> {
#>     2
#> }

We can see above how to include code which removes itself; this code could also be added to the function by another function (in the example above we have simply written it into the body by hand). In the case of a Shiny app, we need to remember that another session reload will be necessary as an equivalent of the function call (an equivalent of fun2()). Below is an example of a Shiny app with a one-time breakpoint.

app.R

library(shiny)
library(magrittr)

ui <- fluidPage(
  modUI("mod"),
  actionButton("browser", "Add browser")
)

server <- function(input, output, session) {
  modServer("mod")
  
  observe({
    mod_env <- environment(modServer)
    body(mod_env$modServer)[[2]][[3]][[3]][[2]][[2]] <- as.call(append(as.list(body(mod_env$modServer)[[2]][[3]][[3]][[2]][[2]]), quote(browser()), 1))
    body(mod_env$modServer)[[2]][[3]][[3]][[2]][[2]] <- as.call(append(as.list(body(mod_env$modServer)[[2]][[3]][[3]][[2]][[2]]), quote(env <- environment(modServer)), 2))
    body(mod_env$modServer)[[2]][[3]][[3]][[2]][[2]] <- as.call(append(as.list(body(mod_env$modServer)[[2]][[3]][[3]][[2]][[2]]), quote(body(env$modServer)[[2]][[3]][[3]][[2]][[2]] <- body(env$modServer)[[2]][[3]][[3]][[2]][[2]][-c(2, 3, 4, 5)]), 3))
    body(mod_env$modServer)[[2]][[3]][[3]][[2]][[2]] <- as.call(append(as.list(body(mod_env$modServer)[[2]][[3]][[3]][[2]][[2]]), quote(getDefaultReactiveDomain()$reload()), 4))
    getDefaultReactiveDomain()$reload()
  }) %>% 
    bindEvent(input$browser)
}

shinyApp(ui, server)

R/mod.R

modUI <- function(id) {
  ns <- NS(id)
  tagList(
    numericInput(ns("num"), "Num", 0)
  )
}

modServer <- function(id) {
  moduleServer(
    id,
    function(input, output, session) {
      observe({
        input$num
      })
    }
  )
}

This example looks quite complex now, so it may be useful to describe the crucial steps (even at the cost of some repetition):

mod_env <- environment(modServer) - we are just getting the environment where the modServer() function is defined; why we use it is explained later
body(mod_env$modServer)[[2]][[3]][[3]][[2]][[2]] <- as.call(append(as.list(body(mod_env$modServer)[[2]][[3]][[3]][[2]][[2]]), quote(browser()), 1)) - browser() is added
body(mod_env$modServer)[[2]][[3]][[3]][[2]][[2]] <- as.call(append(as.list(body(mod_env$modServer)[[2]][[3]][[3]][[2]][[2]]), quote(env <- environment(modServer)), 2)) - to the function we are adding the code responsible to get the environment where this function is defined (i.e. function in which we are setting the breakpoint)
body(mod_env$modServer)[[2]][[3]][[3]][[2]][[2]] <- as.call(append(as.list(body(mod_env$modServer)[[2]][[3]][[3]][[2]][[2]]), quote(body(env$modServer)[[2]][[3]][[3]][[2]][[2]] <- body(env$modServer)[[2]][[3]][[3]][[2]][[2]][-c(2, 3, 4, 5)]), 3)) - we are adding to the function the code responsible to remove the added code from the function
body(mod_env$modServer)[[2]][[3]][[3]][[2]][[2]] <- as.call(append(as.list(body(mod_env$modServer)[[2]][[3]][[3]][[2]][[2]]), quote(getDefaultReactiveDomain()$reload()), 4)) - we are adding the session reload to the function (session will be refreshed after closing debug mode)
getDefaultReactiveDomain()$reload() - we are reloading the session (after breakpoint is set)

A very interesting and useful thing is that a line of code can remove the next line of code while this next line can still be executed. We use this to reload the session after the added code has been removed, and to remove the code responsible for this reload. Perhaps this is clearer when we take a regular function as an example:

fun2 <- function() {
  2
  env <- environment(fun2)
  body(env$fun2) <- body(env$fun2)[-c(3, 4, 5)]
  "two"
}

body(fun2)
#> {
#>     2
#>     env <- environment(fun2)
#>     body(env$fun2) <- body(env$fun2)[-c(3, 4, 5)]
#>     "two"
#> }

fun2()
#> [1] "two"

body(fun2)
#> {
#>     2
#> }

When we run fun2() for the first time, the last element (the string "two"), which we are removing, is returned. So on the first run the function executes all lines of code, but afterwards we can check that the body has changed, and if we ran the function again, only the number 2 would be returned. This is very important, because it means that we can add the session reload to the function, and this session reload will be run (so the changes can be applied) and then removed from the function.

It should be noted here that even if it looks like we are retrieving the original body of the function (because the added elements have been removed), this does not solve the problem (mentioned earlier) with the at element of the list returned by utils::findLineNum(): at will still contain the wrong indices.
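The one-time mechanism described in this section can be checked without Shiny and without entering debug mode. In the sketch below a counter stands in for browser(); the names hits, fun and one_time are made up for the illustration.

```r
hits <- 0

fun <- function() {
  "work"
}

# Code to run once: a side effect (standing in for browser()), a lookup of
# the defining environment, and the line that removes elements 2, 3 and 4.
one_time <- list(
  quote(hits <<- hits + 1),
  quote(self_env <- environment(fun)),
  quote(body(self_env$fun) <- body(self_env$fun)[-c(2, 3, 4)])
)

env <- environment(fun)
body(env$fun) <- as.call(append(as.list(body(env$fun)), one_time, 1))

first <- fun()    # runs the added code, which then removes itself
second <- fun()   # only the original body is left
hits
length(body(fun))
```

The counter is incremented only by the first call, which is the behaviour we want from a breakpoint that should fire once and then disappear.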

“Where Are You, Objects?”

There is one last thing to talk about: getting the (right) environment. In all the Shiny app examples presented so far, we have used environment() to get the environment of an object (function), and we also did this in the regular function example to remove (from within the function) elements of that function. These examples illustrate two aspects important for us: why it is necessary to get the environment where the object is defined, and why it is necessary to be precise.

The code below should help us answer why we need the environment:

a <- 4
fun2 <- function() {
  a <- 2
}
fun2()
a
#> [1] 4

a <- 4
fun2 <- function() {
  env <- .GlobalEnv
  env$a <- 2
}
fun2()
a
#> [1] 2

If we stop thinking about this problem in terms of self-reference, it looks simple: a regular assignment inside a function cannot modify an object defined in the parent environment, because the assignment creates a local variable in the function’s own (child) environment; however, we can do this if we explicitly refer to the environment where the object is defined. Now we just need to remind ourselves that a function is defined in a different environment than the one in which its body is evaluated: it is defined in the parent environment. Thus, to modify the function (from within that function) we need to refer explicitly to the environment where the function is defined, which is no different from our example with the a variable above.

That was the answer to why we needed to refer to the environment inside the function, but why did we also need it in the observe() at the beginning, where we set the breakpoint? On the one hand, it was indeed unnecessary, as we could easily refer to the object directly; on the other hand, this is not safe: what if there were functions with the same name, but in different environments? How do we refer to this function and not that one? environment() returns the environment of the nearest function with the given name, and that may not be enough for our purposes.

env_inner <- ""
env_outer <- ""

fun3 <- function() {
  fun3 <- function() {
    fun3 <- function() {
      env_inner <<- environment(fun3)
    }
    fun3()
  }
  fun3()
  env_outer <<- environment(fun3)
}

fun3()

env_inner
#> <environment: 0x0000000022e49560>

env_outer
#> <environment: 0x0000000022e49678>

We see three functions; all of them have the same name and each of them lives in a different environment. We already know that the function we want to refer to is in an upper environment (because we are inside this function), but if we are not sure that it is in the direct parent, we can end up with a reference to the wrong object, because, as already said, environment() gives us the environment of the nearest object with the chosen name.

In our previous examples we have used environment(), but that was just for simplicity; we see now that it wasn’t a great idea. shinybreakpoint uses something different: a function which searches through the environments to get the needed environment.
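To make the ambiguity concrete, here is a base R sketch of such a search. It is not the function used by shinybreakpoint; find_function_envs() and the two toy environments are assumptions made for this illustration.

```r
# Hypothetical helper (illustration only): collect every environment, from
# `start` upwards, that holds a function bound to `name`.
find_function_envs <- function(name, start) {
  hits <- list()
  env <- start
  while (!identical(env, emptyenv())) {
    if (exists(name, envir = env, inherits = FALSE, mode = "function")) {
      hits[[length(hits) + 1L]] <- env
    }
    env <- parent.env(env)
  }
  hits
}

outer_env <- new.env(parent = globalenv())
inner_env <- new.env(parent = outer_env)
outer_env$fun3 <- function() "outer"
inner_env$fun3 <- function() "inner"

candidates <- find_function_envs("fun3", inner_env)
identical(candidates[[1]], inner_env)  # the nearest definition
identical(candidates[[2]], outer_env)  # another function with the same name
```

A lookup by name alone stops at the first candidate, so an additional piece of information is needed to pick the right environment from the list.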

To be able to refer precisely to a given function, we need to know its environment. It would be relatively easy to connect each object with its environment upfront, when searching for objects across the environments using the code mentioned at the beginning, i.e. lapply(rlang::env_parents(rlang::current_env()), names). We could first use rlang::env_parents(rlang::current_env()) to get the list of environments and then get the objects as a list, using lapply() and names(). This way we would have two lists of the same length, in which elements at the same position refer to the same environment, but a problem occurs when we try to use this environment in the code added along with browser(). Coming back to the Shiny app, we have used this line of code before:

body(mod_env$modServer)[[2]][[3]][[3]][[2]][[2]] <- as.call(append(as.list(body(mod_env$modServer)[[2]][[3]][[3]][[2]][[2]]), quote(env <- environment(modServer)), 2))

We are constructing the env variable, which is an environment, but how do we know that this is the environment containing the modServer we are looking for and not another function of the same name in a different environment (as shown before)? So we know that it is possible to have the list of environments and the objects (functions) which belong to each environment, but how do we use this knowledge? The solution below won’t work:

fun3 <- function() {
  3
}
env <- environment(fun3) # global environment
call("assign", "envir", env)
#> assign("envir", <environment>)

# assign("envir", <environment>)
# Error: unexpected '<' in "assign("envir", <"

We have constructed the call (which is an assignment) using the variable env, which is the environment where fun3 is defined, but the printed call is not valid code: an environment cannot be treated like a regular value, because it has no literal form that could be written out in the source text.

But the idea is not bad: we already have the object (fun3) and the environment (env), so we could just add to the body of the function the code which binds the environment to the name envir. This way we would have the correct environment to refer to, but as we have seen, instead of getting something like assign("envir", .GlobalEnv), we get the invalid syntax assign("envir", <environment>). We can, however, keep the label of the environment we need and search for the correct environment using this label. The label is the name of the environment (if it is named) or its address in memory; to get the label, we use rlang::env_label().

fun3 <- function() {
  3
}
env_lab <- rlang::env_label(environment(fun3))

get_envir <- function(label) {
  envirs <- rlang::env_parents(rlang::current_env())
  names(envirs) <- lapply(envirs, rlang::env_label)
  envirs[[label]]
}

call("assign", "envir", call("get_envir", env_lab))
#> assign("envir", get_envir("global"))

The expression assign("envir", get_envir("global")) can now be added along with browser() to get the right environment. We know that (in the example above) it is the "global" environment, and our search (the get_envir() function) will give us precisely this environment, since the labels are unique. Of course, we still have to make sure that get_envir() is visible in the environment (function) where this code is added. In the case of a Shiny app, it can be stored in the global environment (all objects from the global environment are visible in any environment of the app, because it is the top environment). In the case of shinybreakpoint, because this function lives in the {shinybreakpoint} environment (namespace), access to it is not a problem either (if the package is installed).
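An end-to-end sketch of the label idea, for an unnamed environment, might look like this. It assumes the rlang package is installed. find_envir() is a hypothetical variant of get_envir(): because get_envir() above starts the search from its own frame, it only sees the ancestors of the place where it is defined, so the variant starts from the caller instead. The environment holder is made up and stands in for the unnamed environment of a sourced file.

```r
# Assumes the rlang package is installed (the examples above already use it).

# Hypothetical variant of get_envir(): the search starts at the caller and
# includes it, so it does not depend on where the helper itself is defined.
find_envir <- function(label, start = rlang::caller_env()) {
  force(start)
  envirs <- c(list(start), unclass(rlang::env_parents(start)))
  labels <- vapply(envirs, rlang::env_label, character(1))
  i <- match(label, labels)
  if (is.na(i)) NULL else envirs[[i]]
}

# An unnamed environment that stands in for the one holding a module function
holder <- new.env(parent = environment())
evalq(fun3 <- function() {
  3
}, holder)

# Build assign("envir", find_envir("<label>")) and add it to the body,
# followed by `envir`, so that the function returns what was found
set_envir <- call(
  "assign", "envir", call("find_envir", rlang::env_label(holder))
)
body(holder$fun3) <- as.call(
  append(as.list(body(holder$fun3)), list(set_envir, quote(envir)), 2)
)

identical(holder$fun3(), holder)
```

Only the label, a plain string, is embedded in the added code, so the modified body stays printable and parseable while still pointing at one specific environment.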

II. Closure

Setting a breakpoint is a technique based on understanding source code as commands executed line by line. This is how source code appears to humans: it is stored as text, and we are all used to reading text knowing that the next statement is to the left, to the right or below. Functions make it a little more complicated: a function definition can be far, far away from the line where the function is called, so to know why we received the results we see, it may be necessary to jump to a different place in the file to check the function body (or set a breakpoint). Reactive programming makes it even more complicated: it is not so easy to know which code block has been executed when some input changed.

The idea of storing functions, groups of functions or modules in separate files is an attempt to give source code a more intuitive structure, and this is the point where the difference between text as a story and text as code becomes tangible. We could even start thinking that the affinity of code to text is a burden. Isn’t it true that we split source code into multiple files because we want to be able to read it as a text (story) as easily as possible, and not as a fragmented text (story)? And don’t we feel that setting a breakpoint is now the same as selecting a text fragment in a book, even though we are neither writing nor reading a book? Of course, someone could say that we can use digital tools to search for a text fragment in source code or in a book, but there is one more important difference between a story and code: a story is executed in our brains, but code is not. Do we really need to expose ourselves to all of this source code written as text, to each line? And even if it is in some smaller file, do we still need to always see all of these characters?

This last chapter is not a summary. It is about the idea that perhaps it would be useful to display not the source code, but the relations between the code blocks which belong to the reactive context, to minimize the developer’s exposure to text as far as possible; and, using the fact that the code was executed in the machine, not in the brain, it could be useful to display the last relation, so that the breakpoint can be set somewhere in the chain relevant for an output or a side effect. All of this is of course not new, but because it is still a matter of the future, it is worth recalling in this document.